Distributed Splay Suffix Arrays: A New Structure for Distributed String Search

As a structure for string processing, the suffix array is widely known and extensively studied. However, when the string access pattern follows the "90/10" rule, a suffix array cannot take advantage of the fact that we often search for something we have just found. Although the splay tree is an efficient data structure for small documents when the access pattern follows the "90/10" rule, it requires many structures and an excessive amount of pointer manipulation to process and search large documents efficiently. In this paper, we propose a new and conceptually powerful data structure, called the splay suffix array (SSA), for string search. This data structure combines the features of splay trees and suffix arrays into a new approach that is suitable for implementation on both conventional and clustered computers.
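
For orientation, the following is a minimal sketch of the plain suffix array baseline the abstract starts from, not of the proposed SSA itself. The function names and the naive construction are assumptions made for illustration. Every query pays the full binary search cost, which is the limitation the abstract describes for repeated ("90/10") lookups.

```python
def build_suffix_array(text):
    # Naive construction (illustrative only): sort suffix start
    # positions by the suffix text they point at.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    # Two binary searches locate the block of suffixes that begin
    # with `pattern`; their start positions are the match positions.
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-character prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose m-character prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


sa = build_suffix_array("banana")
print(find_occurrences("banana", sa, "ana"))
```

Searching for the same pattern twice repeats exactly the same work, since nothing in the structure adapts to the access pattern.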

Identification of Most Frequently Occurring Lexis in Body-enhancement Medicinal Unsolicited Bulk e-mails

E-mail has become an important means of electronic communication, but the viability of its usage is marred by Unsolicited Bulk e-mail (UBE) messages. UBE comes in many types, such as pornographic, virus-infected and 'cry-for-help' messages, as well as fake and fraudulent offers for jobs, winnings and medicines. UBE poses technical and socio-economic challenges to the usage of e-mail. To meet this challenge and combat this menace, we need to understand UBE. Towards this end, the current paper presents a content-based textual analysis of more than 2700 body-enhancement medicinal UBE messages. Technically, this is an application of text parsing and tokenization to unstructured textual documents, and we approach it using Bag Of Words (BOW) and Vector Space Document Model techniques. We have attempted to identify the most frequently occurring lexis in the UBE documents that advertise various products for body enhancement. An analysis of the top 100 such lexis is also presented. We exhibit the relationship between the occurrence of a word from the identified lexis set in a given UBE and the probability that the UBE advertises a fake medicinal product. To the best of our knowledge and survey of related literature, this is the first formal attempt to identify the most frequently occurring lexis in such UBE by textual analysis. Finally, this is a sincere attempt to raise alertness against, and mitigate the threat of, such luring but fake UBE.
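
The bag-of-words and vector space steps named above can be sketched roughly as follows. The tokenizer, the function names and the toy messages are all assumptions for illustration; the abstract does not describe the authors' actual preprocessing.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lower-case the text and keep alphabetic runs only.
    return re.findall(r"[a-z]+", text.lower())


def top_lexis(documents, k):
    # Bag of words over the whole corpus: count every token and
    # return the k most frequent ones with their counts.
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts.most_common(k)


def document_vector(doc, vocabulary):
    # Vector space model: one term-frequency component per vocabulary word.
    counts = Counter(tokenize(doc))
    return [counts[word] for word in vocabulary]


# Hypothetical toy messages, not data from the paper.
messages = ["Buy pills now", "Cheap pills, best pills"]
print(top_lexis(messages, 1))
print(document_vector("pills pills now", ["pills", "now", "cheap"]))
```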

Estimation of Skew Angle in Binary Document Images Using Hough Transform

This paper presents two novel techniques for skew estimation of binary document images. The algorithms are based on connected component analysis and the Hough transform. Both methods focus on reducing the amount of input data provided to the Hough transform. In the first method, referred to as the word centroid approach, the centroids of selected words are used for skew detection. In the second method, referred to as the dilate & thin approach, the selected characters are blocked and dilated to get word blocks, and thinning is then applied. The final image fed to the Hough transform contains the thinned coordinates of the word blocks in the image. The methods have been successful in reducing the computational complexity of Hough transform based skew estimation algorithms. Promising experimental results are also provided to demonstrate the effectiveness of the proposed methods.
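
To illustrate why feeding only a few points (such as word centroids) to the Hough transform is cheap, here is a minimal sketch of Hough-style voting over a small point set. The function name, the sum-of-squares score, the bin width and the sign convention for the angle are assumptions; the abstract does not specify the authors' accumulator design or how the centroids are selected.

```python
import math
from collections import Counter


def estimate_skew(points, candidate_angles_deg, rho_bin=2.0):
    # For each candidate text-line angle, project every point onto the
    # line normal (rho = y*cos(theta) - x*sin(theta)) and vote into rho bins.
    # At the true skew, points on the same text line share one bin, so the
    # accumulator is sharply peaked; sum of squared votes measures that.
    best_angle, best_score = None, -1
    for angle in candidate_angles_deg:
        theta = math.radians(angle)
        c, s = math.cos(theta), math.sin(theta)
        votes = Counter(round((y * c - x * s) / rho_bin) for x, y in points)
        score = sum(v * v for v in votes.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


# Synthetic "word centroids": three text lines skewed by 5 degrees.
skew = math.radians(5)
centroids = [(t * math.cos(skew), 50 * k + t * math.sin(skew))
             for k in range(3) for t in range(0, 400, 40)]
print(estimate_skew(centroids, range(-10, 11)))
```

The cost is proportional to the number of points times the number of candidate angles, so reducing a page to a few dozen centroids shrinks the work accordingly.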

A Web Oriented Spread Spectrum Watermarking Procedure for MPEG-2 Videos

In the last decade digital watermarking procedures have become increasingly applied to implement the copyright protection of multimedia digital contents distributed on the Internet. To this end, it is worth noting that a lot of watermarking procedures for images and videos proposed in literature are based on spread spectrum techniques. However, some scepticism about the robustness and security of such watermarking procedures has arisen because of some documented attacks which claim to render the inserted watermarks undetectable. On the other hand, web content providers wish to exploit watermarking procedures characterized by flexible and efficient implementations and which can be easily integrated in their existing web services frameworks or platforms. This paper presents how a simple spread spectrum watermarking procedure for MPEG-2 videos can be modified to be exploited in web contexts. To this end, the proposed procedure has been made secure and robust against some well-known and dangerous attacks. Furthermore, its basic scheme has been optimized by making the insertion procedure adaptive with respect to the terminals used to open the videos and the network transactions carried out to deliver them to buyers. Finally, two different implementations of the procedure have been developed: the former is a high performance parallel implementation, whereas the latter is a portable Java and XML based implementation. Thus, the paper demonstrates that a simple spread spectrum watermarking procedure, with limited and appropriate modifications to the embedding scheme, can still represent a valid alternative to many other well-known and more recent watermarking procedures proposed in literature.

Neural-Symbolic Machine-Learning for Knowledge Discovery and Adaptive Information Retrieval

In this paper, a model for an information retrieval system is proposed which takes into account that knowledge about documents and the information needs of users are dynamic. Two methods are combined, one qualitative or symbolic and the other quantitative or numeric, which are deemed suitable for many clustering contexts, data analysis, concept exploration and knowledge discovery. These two methods may be classified as inductive learning techniques. In this model, they are introduced to build "long term" knowledge about past queries and concepts in a collection of documents. The "long term" knowledge can guide and assist the user in formulating an initial query and can be exploited in the process of retrieving relevant information. The different kinds of knowledge are organized into different points of view. This may be considered an enrichment of the exploration level which is coherent with the concept of document/query structure.

Simulation of Voltage Controlled Tunable All Pass Filter Using LM13700 OTA

In recent years, Operational Transconductance Amplifier (OTA) based high-frequency integrated circuits, filters and systems have been widely investigated. The usefulness of OTAs over conventional op-amps in the design of both first-order and second-order active filters is well documented. This paper uses the Matlab/Simulink® software to discuss some tunability issues that were previously unreported for any commercial OTA. Based on the simulation results, two first-order voltage-controlled all-pass filters with phase tuning capability are proposed.

Using Genetic Algorithm to Improve Information Retrieval Systems

This study investigates the use of genetic algorithms in information retrieval. The method is shown to be applicable to three well-known document collections, where the genetic modification presents more relevant documents to users. In this paper we present a new fitness function for approximate information retrieval which is faster and more flexible than the cosine similarity fitness function.
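
As a point of reference, the cosine similarity baseline mentioned above can be sketched as follows. The `fitness` function, which averages similarity over documents judged relevant, is an assumed formulation for illustration; the abstract does not define either the baseline fitness or the proposed one in detail.

```python
import math


def cosine_similarity(u, v):
    # Cosine of the angle between two term-weight vectors of equal length.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero vector has no direction
    return dot / (norm_u * norm_v)


def fitness(query_weights, relevant_docs):
    # Assumed fitness for a candidate query (a GA chromosome): the mean
    # cosine similarity to the documents judged relevant.
    total = sum(cosine_similarity(query_weights, d) for d in relevant_docs)
    return total / len(relevant_docs)


print(fitness([1, 0], [[1, 0], [0, 1]]))
```

Each evaluation needs two norms and a dot product per document, which is the per-chromosome cost a faster fitness function would have to beat.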

A New Approach to Annotate the Text's of the Websites and Documents with a Quite Comprehensive Knowledge Base

Machine-understandable data, when strongly interlinked, constitutes the basis for the Semantic Web. Annotating web documents is one of the major techniques for creating metadata on the Web. Annotating websites defines the contained data in a form suitable for interpretation by machines. In this paper, we present a new approach to annotating websites and documents that raises the abstraction level of the annotation process to a conceptual level. By this means, we hope to solve some of the problems of current annotation solutions.

Research on Applying the Continuity Care Document to Generate a Medical Record with Entry Level

Transferring patient information between medical care sites is necessary to deliver better patient care and to reduce medical costs. Developing electronic medical records is therefore an important trend worldwide. The Continuity of Care Document (CCD) is the product of collaboration between the CDA and CCR standards. In this study, we develop a system to generate medical records with entry level based on the CCD template module.

Requirements Management in a Distributed Agile Environment

The importance of good requirements engineering is well documented. Agile practices, promoting collaboration and communications, facilitate the elicitation and management of volatile requirements. However, current Agile practices work in a well-defined environment. It is necessary to have a co-located customer. With distributed development it is not always possible to realize this co-location. In this environment a suitable process, possibly supported by tools, is required to support changing requirements. This paper introduces the issues of concern when managing requirements in a distributed environment and describes work done at the Software Technology Research Centre as part of the NOMAD project.

Does the Adoption of IFRS Influence Earnings Management towards Small Positive Profits? Evidence from Emerging Markets

This paper investigates the effect of International Financial Reporting Standards (IFRS) adoption on the frequency of earnings management towards small positive profits. We focus on two emerging-market IFRS adopters: South Africa and Turkey. We estimated our logistic regression using appropriate panel estimation techniques on a sample of 330 South African and 210 Turkish firm-year observations over the period 2002-2008. Our results document that mandatory adoption of IFRS is not associated with a reduction in earnings management towards small positive profits in emerging markets. These results contradict most of the previous findings of studies conducted in developed countries. Based on the legal system factor, we compare the intensity of earnings management between a code law country (Turkey) and a common law country (South Africa) over the pre- and post-adoption periods. Our findings show that the frequency of such earnings management practice increases significantly for the code law country.

Analysing the Elementary Science and Technology Coursebook and Student Workbook in Terms of Constructivism

The curriculum of the primary school science course in Turkey was redesigned on the basis of constructivism in the 2005-2006 academic year. In this context, the course was renamed "Science and Technology", and its content, course books and student workbooks were redesigned in light of constructivism. The aim of this study is to determine whether the Science and Technology course book and student workbook for the 5th grade of primary school are consistent with constructivism by evaluating them against its fundamental principles. Among qualitative research methods, document analysis is applied; samples are selected by criterion sampling, a purposeful sampling technique. Examination of the Science and Technology course book and workbook for the 5th grade shows that the two books complement each other in certain areas. Consequently, it can be claimed that, in spite of some inadequate and missing points, the course book and workbook were designed with the principles of constructivism in mind. To overcome the inadequacies in the books, it is suggested that they be redesigned. In addition, so as not to neglect the technology dimension of the course, activities that encourage students to prepare projects using the technology cycle should be included.

Approaches and Schemes for Storing DTD-Independent XML Data in Relational Databases

The volume of XML data exchange is increasing explosively, and the need for efficient mechanisms of XML data management is vital. Many XML storage models have been proposed for storing DTD-independent XML documents in relational database systems. Benchmarking is the best way to highlight the pros and cons of different approaches. In this study, we use a common benchmarking scheme, known as XMark, to compare the most cited and newly proposed DTD-independent methods in terms of logical reads, physical I/O, CPU time and duration. We show the effect of the label path, of extracting values and storing them in another table, and of the type of join needed for each method's query answering.

A Study of the Variability of Very Low Resolution Characters and the Feasibility of Their Discrimination Using Geometrical Features

Current OCR technology does not allow accurate recognition of small text images, such as those found in web images. Our goal is to investigate new approaches to recognizing very low resolution text images containing antialiased character shapes. This paper presents a preliminary study on the variability of such characters and the feasibility of discriminating them using geometrical features. In a first stage we analyze the distribution of these features. In a second stage we present a study of their discriminative power for recognizing isolated characters, using various rendering methods and font properties. Finally, we present the results of our evaluation tests, leading to our conclusion and future focus.

Coupled Dynamics in Host-Guest Complex Systems Duplicates Emergent Behavior in the Brain

The ability of the brain to organize information and generate the functional structures we use to act, think and communicate, is a common and easily observable natural phenomenon. In object-oriented analysis, these structures are represented by objects. Objects have been extensively studied and documented, but the process that creates them is not understood. In this work, a new class of discrete, deterministic, dissipative, host-guest dynamical systems is introduced. The new systems have extraordinary self-organizing properties. They can host information representing other physical systems and generate the same functional structures as the brain does. A simple mathematical model is proposed. The new systems are easy to simulate by computer, and measurements needed to confirm the assumptions are abundant and readily available. Experimental results presented here confirm the findings. Applications are many, but among the most immediate are object-oriented engineering, image and voice recognition, search engines, and Neuroscience.

Parkinsons Disease Classification using Neural Network and Feature Selection

In this study, the Multi-Layer Perceptron (MLP) with the Back-Propagation learning algorithm is used to classify, and so support effective diagnosis of, Parkinson's disease (PD). Diagnosis is a challenging problem for the medical community. Typically characterized by tremor, PD occurs due to the loss of dopamine in the brain's thalamic region, which results in involuntary or oscillatory movement in the body. A feature selection algorithm is used along with biomedical test values to diagnose Parkinson's disease. Clinical diagnosis relies mostly on the doctor's expertise and experience, yet cases of wrong diagnosis and treatment are still reported. Patients are asked to take a number of tests for diagnosis. In many cases, not all the tests contribute to effective diagnosis of a disease. Our work is to classify the presence of Parkinson's disease with a reduced number of attributes. Originally, 22 attributes are involved in classification. We use Information Gain to rank the attributes, which reduces the number of attributes that need to be collected from patients. An artificial neural network is used to classify the diagnosis of patients. The twenty-two attributes are reduced to sixteen. The accuracy is 82.051% on the training data set and 83.333% on the validation data set.
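
The Information Gain ranking step can be sketched as below. This is a minimal illustration that assumes attribute values have already been discretized; the function names and toy data are hypothetical, and the abstract does not say how the authors handled continuous measurements. With the paper's setting, `k` would be 16 out of 22 attributes.

```python
import math
from collections import Counter


def entropy(labels):
    # Shannon entropy (in bits) of a list of class labels.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    # Entropy of the labels minus the weighted entropy that remains
    # after grouping the samples by the (discrete) feature value.
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_top_features(features, labels, k):
    # Rank attribute names by information gain and keep the best k.
    gains = {name: information_gain(vals, labels) for name, vals in features.items()}
    return sorted(gains, key=gains.get, reverse=True)[:k]


# Hypothetical toy data: attribute "a" separates the classes, "b" does not.
labels = [1, 1, 0, 0]
features = {"a": ["hi", "hi", "lo", "lo"], "b": ["x", "y", "x", "y"]}
print(select_top_features(features, labels, 1))
```

Only the selected attributes would then be passed to the neural network classifier.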

Optimization of Protein Hydrolysate Production Process from Jatropha curcas Cake

This is the first report of an investigation into optimizing protein hydrolysate production from J. curcas cake. Proximate analysis of the raw material showed 18.98% protein, 5.31% ash, 8.52% moisture and 12.18% lipid. The appropriate protein hydrolysate production process began with grinding the J. curcas cake into small pieces. The ground cake was then suspended in 2.5% sodium hydroxide solution at a solution to J. curcas cake ratio of 80:1 (v/w). The hydrolysis reaction was held at 50 °C in a water bath for 45 minutes. After that, the supernatant (protein hydrolysate) was separated by centrifugation at 8000g for 30 minutes. The maximum yield of the resulting protein hydrolysate was 73.27%, with 7.34% moisture, 71.69% total protein, 7.12% lipid and 2.49% ash. The product also dissolved well in water.

Removal of Hydrogen Sulphide from Air by Means of Fibrous Ion Exchangers

The removal of hydrogen sulphide is required for reasons of health, odour, safety and corrosivity. The means of removing hydrogen sulphide mainly depend on its concentration and the kind of medium to be purified. The paper deals with a method of removing hydrogen sulphide from the air by catalytic oxidation to elemental sulphur using an Fe-EDTA complex. The possibility of obtaining fibrous filtering materials able to remove small concentrations of H2S from the air is described. The base of these materials is a fibrous ion exchanger with the Fe(III)-EDTA complex immobilized on its functional groups. The complex of trivalent iron converts hydrogen sulphide to elemental sulphur. Bivalent iron formed in the reaction is oxidized by atmospheric oxygen, so the complex of trivalent iron is continuously regenerated and the overall process can be regarded as pseudocatalytic. The paper describes the properties of several fibrous catalysts based on ion exchangers of different chemical nature (weak acid, weak base and strong base). It is shown that the main parameters affecting the process of catalytic oxidation are the concentration of hydrogen sulphide in the air, the relative humidity of the purified air, the process time and the content of Fe-EDTA complex in the fibres. The data presented show that filtering layers with an anion exchange package are much more active in the catalytic removal of hydrogen sulphide than cation exchangers and inert materials. In addition to the nature of the fibres, relative air humidity is a critical factor determining the efficiency of the material in purifying air of H2S. The most promising carriers of the Fe-EDTA catalyst for hydrogen sulphide oxidation proved to be Fiban A-6 and Fiban AK-22 fibres.

Usability Evaluation of Online News Websites: A User Perspective Approach

Online news websites are one of the main and broadest areas of mass media. Since the nineties, several Jordanian newspapers have been introduced to the World Wide Web to reach large and varied audiences. Examples of newspapers that have an online version are Al-Rai, Ad-Dustor and AlGhad. Purely online news websites include Ammon and Rum. The main aim of this study is to evaluate online newspaper websites using two assessment measures: usability and web content. This aim is achieved through a questionnaire-based evaluation built on the definition of usability and web content in ISO standard 9241, part 11. The results are based on 204 audience responses. They show that the usability factor is relatively good for all Jordanian online newspapers, whereas the web content factor is moderate.

Review of the Characteristics of Mahan Garden: One Type of Persian Gardens

Iranians' image of heaven, the reward for a person's good deeds during their life, has shown itself in pleasant green gardens, and earthly gardens were made as representations of paradise. Iranians are also quite interested in making earthly gardens and plantations around their buildings. Given Iran's hot and dry climate and its lack of sufficient water for plant cover, the importance of the Iranians' art of garden making becomes clear. This study, drawing on examples, documents and library studies, investigates the characteristics of Persian gardens. The results show that elements such as soil, water, plants and layout have been used in forming a unique style of Persian garden. Bagh-e Shah Zadeh Mahan (Mahan prince garden) is a typical example and has been carefully studied. In this paper I investigate and evaluate the characteristics of a Persian garden by means of a descriptive approach.