A Study on Finding Similar Document with Multiple Categories

Searching for similar documents and document management have an important place in text mining. One of the most important parts of similar-document research is the process of classifying or clustering the documents. In this study, a similar document search approach has been carried out that includes a discussion of the case of documents belonging to multiple categories (the multiple categories problem). The proposed method, which is based on Fuzzy Similarity Classification (FSC), has been compared with the Rocchio algorithm and the naive Bayes method, which are widely used in text mining. Empirical results show that the proposed method is quite successful and can be applied effectively. For the second stage, a multiple categories vector method has been used, based on how frequently categories are seen together. Empirical results show that performance almost doubles when the proposed method is compared with the classical approach.

A Multilanguage Source Code Retrieval System Using Structural-Semantic Fingerprints

Source code retrieval is of immense importance in the software engineering field. The complex tasks of retrieving and extracting information from source code documents are vital in the development cycle of large software systems. The two main subtasks that result from these activities are code duplication prevention and plagiarism detection. In this paper, we (Mohamed Amine Ouddan and Hassane Essafi) propose a source code retrieval system based on a two-level fingerprint representation that captures, respectively, the structural and the semantic information within source code. A sequence alignment technique is applied to these fingerprints in order to quantify the similarity between source code portions. The specific purpose of the system is to detect plagiarism and duplicated code between programs written in different programming languages belonging to the same class, such as C, C++, Java and CSharp. These four languages are supported by the current version of the system, which is designed so that it can be easily adapted to any programming language.
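
As an illustration of how a sequence alignment can quantify the similarity of two fingerprints, the following is a minimal sketch and not the authors' implementation. It assumes that a fingerprint is a list of language-independent tokens and uses a Smith-Waterman style local alignment; the token names, the scoring values and the normalisation are hypothetical choices that the abstract does not specify.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score (Smith-Waterman) of two token sequences."""
    cols = len(b) + 1
    prev = [0] * cols          # previous row of the score matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, prev[j - 1] + step, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def similarity(a, b, match=2):
    """Score normalised by the best score the shorter sequence could reach."""
    shortest = min(len(a), len(b))
    if shortest == 0:
        return 0.0
    return local_alignment_score(a, b, match=match) / (match * shortest)


# Hypothetical structural fingerprints of a C function and a Java method.
c_fingerprint = ["FUNC", "LOOP", "IF", "ASSIGN", "CALL", "RETURN"]
java_fingerprint = ["CLASS", "FUNC", "LOOP", "IF", "ASSIGN", "RETURN"]

print(similarity(c_fingerprint, java_fingerprint))
```

Because the tokens abstract away from concrete syntax, the same scoring can compare fragments written in different languages of the same class.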

An Advanced Approach Based on Artificial Neural Networks to Identify Environmental Bacteria

Environmental micro-organisms include a large number of taxa and some species that are generally considered nonpathogenic but can represent a risk in certain conditions, especially for elderly people and immunocompromised individuals. Chemotaxonomic identification techniques are powerful tools for environmental micro-organisms, and cellular fatty acid methyl ester (FAME) content is a powerful fingerprinting identification technique. A system based on an unsupervised artificial neural network (ANN) was set up using the fatty acid profiles of standard bacterial strains, obtained by gas chromatography, as learning data. We analysed 45 certified strains belonging to the genera Acinetobacter, Aeromonas, Alcaligenes, Aquaspirillum, Arthrobacter, Bacillus, Brevundimonas, Enterobacter, Flavobacterium, Micrococcus, Pseudomonas, Serratia, Shewanella and Vibrio. A set of 79 bacteria isolated from a drinking water line (AMGA, the major water supply system in Genoa) was used as an example for identification, compared with the standard MIDI method. The resulting ANN output map was found to be a very powerful tool for identifying these fresh isolates.

Relevance Feedback within CBIR Systems

We present the results of a comparative study of some techniques available in the literature for the relevance feedback mechanism in the case of short-term learning. Only one of the methods considered here, the K-nearest neighbors algorithm (KNN), belongs to the data mining field; the rest relate purely to the information retrieval field and fall under three major axes: query shifting, feature weighting, and optimization of the parameters of the similarity metric. As a contribution, in addition to the comparison, we propose a new version of the KNN algorithm, referred to as incremental KNN, which differs from the original version in that, besides the influence of the seeds, the rating of the current target image is also influenced by the images already rated. The results presented here were obtained from experiments conducted on the Wang database for one iteration, using color moments in the RGB space. This compact descriptor, Color Moments, is adequate for the efficiency needed in interactive systems. The results allow us to claim that the proposed algorithm yields good results; it even outperforms a wide range of techniques available in the literature.
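
To make the compactness of the Color Moments descriptor concrete, here is a minimal sketch under stated assumptions: it takes the first three moments of each RGB channel (mean, standard deviation, and the signed cube root of the third central moment), which is the usual definition but is not spelled out in the abstract. The function name and the pixel representation are hypothetical.

```python
import math


def color_moments(pixels):
    """Return 9 features: (mean, std, skewness) for the R, G and B channels.

    pixels is a non-empty list of (r, g, b) tuples.
    """
    n = len(pixels)
    if n == 0:
        raise ValueError("at least one pixel is required")
    features = []
    for channel in range(3):
        values = [p[channel] for p in pixels]
        mean = sum(values) / n
        second = sum((v - mean) ** 2 for v in values) / n
        third = sum((v - mean) ** 3 for v in values) / n
        std = math.sqrt(second)
        # Signed cube root, so that left-skewed channels give a negative value.
        skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
        features.extend([mean, std, skew])
    return features


# Hypothetical 3-pixel image, used only to show the shape of the output.
print(color_moments([(0, 10, 20), (0, 10, 20), (3, 10, 20)]))
```

Whatever the image size, the descriptor has only nine numbers, which keeps distance computations cheap during an interactive feedback loop.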

Auto Classification for Search Intelligence

This paper proposes an auto-classification algorithm for Web pages using data mining techniques. We consider the problem of discovering association rules between terms in a set of Web pages belonging to a category in a search engine database, and present an auto-classification algorithm for solving this problem that is fundamentally based on the Apriori algorithm. The proposed technique has two phases. The first phase is a training phase in which human experts determine the categories of different Web pages, and the supervised data mining algorithm combines these categories with appropriately weighted index terms according to the most highly supported rules among the most frequent words. The second phase is the categorization phase, in which a web crawler crawls the World Wide Web to build a database categorized according to the result of the data mining approach. This database contains URLs and their categories.
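
The core of the training phase, finding term sets that are frequent among the pages of one category, can be sketched with a plain Apriori level-wise search. This is an illustrative sketch only: the page contents, the `min_support` threshold and the function name are assumptions, and the weighting of index terms described in the abstract is not modelled.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Apriori: map each frequent term set (frozenset) to its support in [0, 1]."""
    pages = [frozenset(t) for t in transactions]
    n = len(pages)
    candidates = {frozenset([term]) for page in pages for term in page}
    result = {}
    k = 1
    while candidates:
        # Count in how many pages each candidate term set occurs.
        support = {c: sum(1 for page in pages if c <= page) / n for c in candidates}
        frequent = {c: s for c, s in support.items() if s >= min_support}
        result.update(frequent)
        # Build (k+1)-candidates whose k-subsets are all frequent (Apriori pruning).
        k += 1
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in frequent for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return result


# Hypothetical term sets of four pages that experts placed in one category.
sports_pages = [
    {"football", "league", "goal"},
    {"football", "goal", "coach"},
    {"football", "league", "transfer"},
    {"goal", "league", "football"},
]

for terms, support in sorted(frequent_itemsets(sports_pages, 0.75).items(), key=str):
    print(sorted(terms), support)
```

Term sets that survive the threshold are the raw material for rules such as "pages containing football and league belong to this category".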

Water Quality and Freshwater Fish Diversity at Khao Luang National Park, Thailand

Water quality and freshwater fish diversity at nine waterfalls in Khao Luang National Park, Thailand, were examined. Streams were shallow and fast flowing, with clear water and rocky and sandy substrate. The mean water quality values of the waterfalls at Khao Luang National Park were as follows: pH 7.50, air temperature 24.27 °C, water temperature 26.37 °C, dissolved oxygen 7.88 mg/l, hardness 4.44-21.33 mg/l, alkalinity 3.55-11.88 mg/l (as CaCO3). Twenty fish species belonging to nine families were found at Khao Luang National Park. A cluster analysis of water quality revealed that the waterfalls at Khao Luang National Park were divided into two groups, A and B. Group A was composed of two waterfalls (Aie Kaew and Wangmaipak) that flowed to the Gulf of Thailand side. Group B was composed of seven waterfalls (Promlok, Kalom, Nuafa, Suankun, Soidaw, Suanhai, and Thapae) that flowed to the Andaman Sea side (Fig. 2). The Cyprinids represented the major species in all the waterfalls, comprising 45%.

Design, Fabrication and Evaluation of MR Damper

This paper presents the design, fabrication and evaluation of a magneto-rheological damper. Semi-active control devices have received significant attention in recent years because they offer the adaptability of active control devices without requiring the associated large power sources. Magneto-Rheological (MR) dampers are semi-active control devices that use MR fluids to produce controllable dampers. They potentially offer highly reliable operation and can be viewed as fail-safe in that they become passive dampers if the control hardware malfunctions. The advantages of MR dampers over conventional dampers are that they are simple in construction, offer a compromise between high-frequency isolation and natural-frequency isolation, offer semi-active control, use very little power, respond very quickly, have few moving parts, have relaxed tolerances and interface directly with electronics. Magneto-Rheological (MR) fluids are controllable fluids belonging to the class of active materials that have the unique ability to change dynamic yield stress when acted upon by an electric or magnetic field, while maintaining relatively constant viscosity. This property can be utilized in an MR damper, where the damping force is changed by changing the rheological properties of the fluid magnetically. MR fluids have a higher dynamic yield stress than Electro-Rheological (ER) fluids and a broader operational temperature range. The objective of this paper was to study the application of an MR damper to vibration control, to design the vibration damper using MR fluids, and to test and evaluate its performance. In this paper, the rheology and the theory behind MR fluids and their use in vibration control were studied. Then an MR vibration damper suitable for vehicle suspension was designed and fabricated using the MR fluid. The MR damper was tested using a dynamic test rig, and the results were obtained in the form of force vs. velocity and force vs. displacement plots. The results were encouraging and greatly inspire further research on the topic.

All Proteins Have a Basic Molecular Formula

This study proposes a basic molecular formula for all proteins. A total of 10,739 proteins belonging to 9 different protein groups classified on the basis of their functions were selected randomly. They included enzymes, storage proteins, hormones, signalling proteins, structural proteins, transport proteins, immunoglobulins or antibodies, motor proteins and receptor proteins. After obtaining the protein molecular formula using the ProtParam tool, the H/C, N/C, O/C, and S/C ratios were determined for each randomly selected sample. In this case, H, N, O, and S coefficients were specified per carbon atom. Surprisingly, the results demonstrated that H, N, O, and S coefficients for all 10,739 proteins are similar and highly correlated. This study demonstrates that despite differences in the structure and function, all known proteins have a similar basic molecular formula $\mathrm{C}_{n}\mathrm{H}_{(1.58 \pm 0.015)n}\mathrm{N}_{(0.28 \pm 0.005)n}\mathrm{O}_{(0.30 \pm 0.007)n}\mathrm{S}_{(0.01 \pm 0.002)n}$. The total correlation between all coefficients was found to be 0.9999.
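
The per-carbon coefficients described above are simple to compute once a molecular formula is available. The sketch below is illustrative only: it assumes the formula is given as a plain string such as `C257H383N65O77S6` (an example input, not data from the study), and the function names are hypothetical.

```python
import re


def parse_formula(formula):
    """Parse a flat molecular formula such as 'C257H383N65O77S6' into atom counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def per_carbon_ratios(formula):
    """Return the H/C, N/C, O/C and S/C ratios of a molecular formula."""
    counts = parse_formula(formula)
    carbon = counts.get("C", 0)
    if carbon == 0:
        raise ValueError("formula contains no carbon")
    return {element: counts.get(element, 0) / carbon for element in ("H", "N", "O", "S")}


# Illustrative input only; any formula string of the same shape works.
for element, ratio in per_carbon_ratios("C257H383N65O77S6").items():
    print(f"{element}/C = {ratio:.3f}")
```

Repeating this for every protein in a sample and averaging each ratio gives coefficients of the kind reported in the formula above.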

A Sustainable Design that Enhance the Quality of Life and Human Behavior's

Public parks are placed high on the research agenda, with many studies addressing their social, economic and environmental influences in different countries around the world. They have been recognized as contributors to the physical quality of urban environments. Recently, a broader view of public parks has emerged. This view goes well beyond the traditional value of parks as places for mere recreation and visual delight, to depict them as valuable contributors to broader strategic objectives, such as property values, place attractiveness, job opportunities, social belonging, public health, tourist development, and improving the overall quality of life. This research examines the role of public parks in enhancing the quality of human life in the Egyptian environment. It measures 'quality of life' in terms of 'human needs' and 'well-being'. This should open ways for policymakers, practitioners, researchers and the public to realize the potential of public parks for improving the quality of life.

Mycoflora of Activated Sludge with MBRs in Berlin, Germany

Thirty-six samples from each type of activated sludge (aerobic and anoxic) were collected from two wastewater treatment plants with MBRs in Berlin, Germany. The samples were prepared for counting and identification of fungal isolates; these isolates were purified by conventional techniques and identified by microscopic examination. Sixty-two species belonging to 28 genera were isolated from activated sludge samples under aerobic conditions (28 genera and 58 species) and anoxic conditions (26 genera and 52 species). The data show that Aspergillus (94.4%), Penicillium (61.1%), Fusarium (61.1%), Trichoderma (44.4%) and Geotrichum candidum (41.6%) were the most prevalent in all activated sludge samples. The study confirmed that fungi can thrive and sporulate in activated sludge, but were isolated in different numbers depending on the aeration system. Some fungal species in our study are saprophytic, and others are pathogenic to plants and animals.

Swedish: Being or Becoming? Immigration, National Identity and the Democratic State

This article discusses superordinate national identity as a means for immigrants' integration into democratic polities. It is suggested that a superordinate national identity perceived as inclusive, by immigrants and by the native population, would be conducive to such integration. Command of the dominant language of society is seen as the most important of the inclusive criteria. Other such criteria are respect for the country's political institutions and feelings of belonging to the country where you live. The argument is supported by data from a recent study among 1000 secondary school students of 'Swedish' and non-'Swedish' backgrounds, showing a majority in favour of inclusive criteria for 'Swedishness'.

Effect of Shared Competences in Industrial Districts on Knowledge Creation and Absorptive Capacity

The literature has argued that firms based in industrial districts enjoy advantages for creating internal knowledge and absorbing external knowledge as a consequence of the knowledge flows and spillovers that exist in the district. However, empirical evidence showing how belonging to an industrial district affects the business processes of creation and absorption of knowledge is scarce and, moreover, empirical research has not taken into account the influence of variations in the flows of knowledge circulating in each cluster. This study aims to extend empirical evidence on the effect that the stock of shared competencies in industrial districts has on the business processes of creation and absorption of knowledge, through data from an initial study on 952 firms and 35 industrial districts in Spain.

Comparison of Different Solvents and Extraction Methods for Isolation of Phenolic Compounds from Horseradish Roots (Armoracia rusticana)

Horseradish (Armoracia rusticana) is a perennial herb belonging to the Brassicaceae family and contains biologically active substances. The aim of the current research was to determine the best method for extracting phenolic compounds showing high antiradical activity from horseradish roots. Roots of three horseradish genotypes (No. 105, No. 106 and variety ‘Turku’) were extracted with eight different solvents: n-hexane, ethyl acetate, diethyl ether, 2-propanol, acetone, ethanol (95%), ethanol / water / acetic acid (80/20/1 v/v/v) and ethanol / water (80/20 by volume), using two extraction methods (conventional and Soxhlet). Ethanol and ethanol / water solutions can be chosen as the best solvents. Although the total phenolic content (TPC) was higher in Soxhlet extracts, the scavenging activity of DPPH˙ radicals did not increase. It can be concluded that the Soxhlet extraction method extracts more compounds that are not effective antioxidants.

Calculation of Heating Load for an Apartment Complex with Unit Building Method

As a simple method to estimate the plant heating energy capacity of an apartment complex, a new load calculation method has been proposed. The method, which can be called the unit building method, predicts the heating load of the entire complex instead of summing up that of each apartment belonging to the complex. Comparison of the unit heating load for various floor sizes between the present method and the conventional approach shows close agreement with a dynamic load calculation code. Some additional calculations are performed to demonstrate its application.

Study of Features for Hand-printed Recognition

The feature extraction method(s) used to recognize hand-printed characters play an important role in ICR applications. In order to achieve a high recognition rate for a recognition system, the choice of a feature that suits the given script is certainly an important task. Even if a new feature is required to be designed for a given script, it is essential to know the recognition ability of the existing features for that script. Devanagari script is used in various Indian languages besides Hindi, the mother tongue of the majority of Indians. This research examines a variety of feature extraction approaches, which have been used in various ICR/OCR applications, in the context of Devanagari hand-printed script. The study is conducted theoretically and experimentally on more than 10 feature extraction methods. The various feature extraction methods have been evaluated on a Devanagari hand-printed database comprising more than 25000 characters belonging to 43 alphabets. The recognition ability of the features has been evaluated using three classifiers, namely k-NN, MLP and SVM.

Sense of Territoriality and Revitalization of Neighborhood Centers in Boshrooyeh City

The role of the neighborhood center as a semi-public space (the balance space) that bonds the private and the public has disappeared in new urbanism. A hierarchical principle in the traditional neighborhood center appears to create or develop the conditions for residents' relationships and belonging. This paper evaluates the significance of the hierarchical principles of the neighborhood center for residents' territoriality and its factors. The Miandeh neighborhood center in Boshrooyeh city was selected as the case study area. Results indicated that a hierarchical principle is the best instrument for improving territoriality as a subcomponent of residents' place belonging. The findings help urban designers to revitalize neighborhoods and to organize physical space.

Energy Efficient Cooperative Caching in WSN

Wireless sensor networks (WSNs) consist of a number of tiny, low-cost and low-power sensor nodes that monitor some physical phenomenon. The major limitation of these networks is the use of non-rechargeable batteries with a limited power supply. The main cause of energy consumption in such networks is the communication subsystem. This paper presents an energy-efficient Cluster Cooperative Caching at Sensor (C3S) scheme based upon grid-type clustering. Sensor nodes belonging to the same cluster/grid form a cooperative cache system for the node, since the cost of communicating with them is low in terms of both energy consumption and message exchanges. The proposed scheme uses cache admission control and a utility-based data replacement policy to ensure that more useful data is retained in the local cache of a node. Simulation results demonstrate that the C3S scheme performs better on various performance metrics than NICoCa, an existing cooperative caching protocol for WSNs.

Maximum Norm Analysis of a Nonmatching Grids Method for Nonlinear Elliptic Boundary Value Problem −Δu = f(u)

We provide a maximum norm analysis of a finite element Schwarz alternating method for a nonlinear elliptic boundary value problem of the form -Δu = f(u), on two overlapping subdomains with nonmatching grids. We consider a domain which is the union of two overlapping subdomains where each subdomain has its own independently generated grid. The two meshes being mutually independent on the overlap region, a triangle belonging to one triangulation does not necessarily belong to the other one. Under a Lipschitz assumption on the nonlinearity, we establish, on each subdomain, an optimal L∞ error estimate between the discrete Schwarz sequence and the exact solution of the boundary value problem.

Entrepreneur Features as a Competence in the Design of the European Higher Education Area Degrees

This paper aims to explain the project carried out at the University of Cordoba, specifically at the High Polytechnic School, in collaboration with two other organizations belonging to the Andalusian Ministry of Innovation, Science and Business: the Andalusian Innovation and Development Agency (IDEA agency) [1] and the Territorial Net of Entrepreneurship Support (in Spanish, Red Territorial de Apoyo al Emprendedor) [11]. The project is being developed in several stages, of which only the first has been completed. However, several important preliminary results derive from it, based mainly on the description of the nature of entrepreneurship in the field of university education and its impact on students' competency as recommended by the European Higher Education Area. Some problems holding back the correct future development, derived from the specific context in which the project is applied, will also be shown.

Quality of Service in Multioperator GPON Access Networks with Triple-Play Services

Recently, in some places, optical-fibre access networks have been used with GPON technology belonging to organizations (in most cases public bodies) that act as neutral operators. These operators simultaneously provide network services to various telecommunications operators that offer integrated voice, data and television services. This situation creates new problems related to quality of service, since the interests of the users are intermingled with the interests of the operators. In this paper, we analyse this problem and consider solutions that make it possible to provide guaranteed quality of service for voice over IP, data services and interactive digital television.