Abstract: Text similarity measurement is a fundamental issue in
many textual applications such as document clustering, classification,
summarization and question answering. However, prevailing approaches
based on Vector Space Model (VSM) more or less suffer
from the limitation of Bag of Words (BOW), which ignores the semantic
relationship among words. Enriching document representation
with background knowledge from Wikipedia has proven to be an effective
way to solve this problem, but most existing methods still
cannot avoid flaws similar to those of BOW in a new vector space. In this
paper, we propose a novel text similarity measurement which goes
beyond VSM and can find semantic affinity between documents.
Specifically, it is a unified graph model that exploits Wikipedia as
background knowledge and synthesizes both document representation
and similarity computation. The experimental results on two different
datasets show that our approach significantly outperforms VSM-based
methods in both text clustering and classification.
Abstract: A variety of new technology-based services have
emerged with the development of Information and Communication
Technologies (ICTs). Since technology-based services have technology-driven characteristics, the identification of relationships
between technology-based services and ICTs would yield meaningful implications. Thus, this paper proposes an approach for identifying the
relationships between technology-based services and ICTs by
analyzing patent documents. First, business model (BM) patents are
classified into relevant service categories. Second, patent citation
analysis is conducted to investigate the technological linkage and impacts between technology-based services and ICTs at the macro level.
Third, as a micro-level analysis, patent co-classification analysis is
employed to identify the technological linkage and coverage. The
proposed approach could guide and help managers and designers of
technology-based services to discover opportunities for developing new technology-based services in emerging service sectors.
Abstract: One of the biggest problems of SMEs is their tendency toward financial distress because of an insufficient financial background. In this study, an Early Warning System (EWS) model based on data mining for financial risk detection is presented. The CHAID algorithm has been used to develop the EWS. The developed EWS can serve as a tailor-made financial advisor in the decision-making process of firms, thanks to its automated nature, for those who have an inadequate financial background. Besides, an application of the model was implemented covering 7,853 SMEs based on Turkish Central Bank (TCB) 2007 data. By using the EWS model, 31 risk profiles, 15 risk indicators, 2 early warning signals, and 4 financial road maps have been determined for financial risk mitigation.
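CHAID grows a tree by choosing, at each node, the split whose contingency with the class label is most significant under a chi-square test. A minimal sketch of that split-selection step is below; the data and feature names are illustrative placeholders, not the TCB 2007 dataset, and the full CHAID procedure (level merging, stopping rules) is beyond this sketch.

```python
# Sketch of CHAID-style split selection: pick the categorical feature whose
# contingency table against the label has the most significant chi-square p-value.
import numpy as np
from scipy.stats import chi2_contingency

def best_chaid_split(X, y, feature_names):
    """Return (feature_name, p_value) of the most significant split."""
    best_name, best_p = None, 1.0
    for j, name in enumerate(feature_names):
        levels = np.unique(X[:, j])
        classes = np.unique(y)
        # Contingency table: feature level vs. class label.
        table = [[int(np.sum((X[:, j] == lv) & (y == c))) for c in classes]
                 for lv in levels]
        _, p, _, _ = chi2_contingency(table)
        if p < best_p:
            best_name, best_p = name, p
    return best_name, best_p
```

The chosen feature would then partition the node, and the procedure recurses on each child until no significant split remains.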
Abstract: Cryo-electron microscopy (CEM) in combination with
single particle analysis (SPA) is a widely used technique for
elucidating structural details of macromolecular assemblies at close-to-atomic
resolutions. However, the development of automated software
for SPA processing is still vital since thousands to millions of
individual particle images need to be processed. Here, we present our
workflow for automated particle picking. Our approach integrates
peak-shape analysis into the classical correlation, together with an iterative
approach that separates macromolecules from background by
classification. This particle selection workflow furthermore provides
a robust means for SPA with little user interaction. The performance of the
presented tools is assessed on both simulated and experimental data.
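The classical correlation step that the workflow above extends can be sketched as template matching followed by thresholded local-maximum detection. The synthetic "micrograph", the cross-shaped template, and the threshold are illustrative assumptions; the peak-shape analysis itself is not shown.

```python
# Sketch of correlation-based particle picking: correlate a zero-mean template
# with the micrograph and keep local maxima of the score above a threshold.
import numpy as np
from scipy.signal import correlate2d

def pick_particles(micrograph, template, threshold=0.9):
    """Return (row, col) candidate picks: local correlation maxima above threshold."""
    t = template - template.mean()                 # zero-mean template
    score = correlate2d(micrograph - micrograph.mean(), t, mode="same")
    score = score / score.max()                    # normalize to [.., 1]
    picks = []
    h, w = score.shape
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            window = score[i - 1:i + 2, j - 1:j + 2]
            if score[i, j] >= threshold and score[i, j] == window.max():
                picks.append((i, j))
    return picks
```

In practice the candidate peaks would then be filtered by peak shape and iterative classification, as described above.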
Abstract: Mobile agents are a powerful approach to developing distributed systems, since they migrate to hosts on which they have the resources to execute individual tasks. In a dynamic environment such as a peer-to-peer network, agents have to be generated frequently and dispatched to the network, so they inevitably consume a certain amount of bandwidth on each link. If too many agents migrate through one or several links at the same time, they introduce excessive transfer overhead; these links become busy and indirectly block the network traffic. There is therefore a need to develop routing algorithms that take traffic load into account. In this paper we seek to create cooperation between a probabilistic quality measure of the network traffic situation and the agent's decision making for migration to the next hop, based on decision-tree learning algorithms.
Abstract: Genome profiling (GP), a genotype-based technology which exploits random PCR and temperature gradient gel electrophoresis, has been successful in the identification/classification of organisms. In this technology, spiddos (Species identification dots) and PaSS (Pattern similarity score) are employed for measuring the closeness (or distance) between genomes. Based on this closeness (PaSS), we can build phylogenetic trees of the organisms. We noticed that the topology of the tree is rather robust against the experimental fluctuation conveyed by spiddos. This fact was confirmed quantitatively in this study by computer simulation, providing the limits of the reliability of this highly powerful methodology. As a result, we could demonstrate the effectiveness of the GP approach for the identification/classification of organisms.
Abstract: An important structuring mechanism for knowledge bases is building clusters based on the content of their knowledge objects. The objects are clustered based on the principle of maximizing the intraclass similarity and minimizing the interclass similarity. Clustering can also facilitate taxonomy formation, that is, the organization of observations into a hierarchy of classes that group similar events together. Hierarchical representation allows us to easily manage the complexity of knowledge, to view the knowledge at different levels of detail, and to focus our attention on the interesting aspects only. One such efficient and easy-to-understand system is the Hierarchical Production Rule (HPR) system. An HPR, a standard production rule augmented with generality and specificity information, is of the following form: Decision If <condition> Generality <generality information> Specificity <specificity information>. HPR systems are capable of handling taxonomical structures inherent in knowledge about the real world. In this paper, a set of related HPRs is called a cluster and is represented by an HPR-tree. This paper discusses an algorithm based on a cumulative learning scenario for the dynamic structuring of clusters. The proposed scheme incrementally incorporates new knowledge into the set of clusters from previous episodes and also maintains a summary of clusters as a Synopsis to be used in future episodes. Examples are given to demonstrate the behaviour of the proposed scheme. The suggested incremental structuring of clusters would be useful in mining data streams.
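The HPR form above, with its generality and specificity links, maps naturally onto a tree node. A minimal sketch follows; the field names, the `refine` traversal, and the toy taxonomy are illustrative assumptions, not the paper's algorithm for dynamic cluster structuring.

```python
# Sketch of an HPR node and HPR-tree traversal: generality points to a more
# general rule, specificity lists more specific refinements of it.
from dataclasses import dataclass, field
from typing import Optional, List

@dataclass
class HPR:
    decision: str
    condition: str
    generality: Optional["HPR"] = None                      # more general rule
    specificity: List["HPR"] = field(default_factory=list)  # more specific rules

def attach(parent, child):
    """Link a more specific rule under a more general one."""
    child.generality = parent
    parent.specificity.append(child)

def refine(rule, facts):
    """Descend an HPR-tree to the most specific rule whose condition holds."""
    for child in rule.specificity:
        if child.condition in facts:
            return refine(child, facts)
    return rule
```

Traversal stops at the most specific applicable rule, mirroring how an HPR-tree exploits taxonomic structure during inference.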
Abstract: Lateral-torsional buckling (LTB) is one of the
phenomena controlling the ultimate bending strength of steel I-beams
carrying distributed loads on the top flange. Built-up I-sections
are used as main beams and distributors. This study investigates the
ultimate bending strength of such beams with sections of different
classes including slender elements. The nominal strengths of the
selected beams are calculated for different unsupported lengths
according to the provisions of the American Institute of Steel
Construction (AISC-LRFD). These calculations are compared with
results of a nonlinear inelastic study using an accurate FE model for this
type of loading. The goal is to investigate the performance of the
provisions for the selected sections. A continuous distributed load at
the top flange of the beams was applied in the FE model.
Imperfections of different magnitudes are introduced into the FE model to
examine their effect on the LTB of beams at failure, and hence, their
effect on the ultimate strength of beams. The study also introduces a
procedure for evaluating the performance of the provisions compared
with the accurate FEA results of the selected sections. A simplified
design procedure is given and recommendations for future code
updates are made.
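For context, the nominal LTB strength evaluated above follows, to the best of our reading of the AISC provisions for compact doubly symmetric I-sections (symbols as in the specification; verify against the current edition before use), a three-range form in the unbraced length L_b:

```latex
M_n =
\begin{cases}
M_p = F_y Z_x, & L_b \le L_p \\[4pt]
C_b\!\left[M_p - \left(M_p - 0.7\,F_y S_x\right)\dfrac{L_b - L_p}{L_r - L_p}\right] \le M_p, & L_p < L_b \le L_r \\[4pt]
F_{cr}\,S_x \le M_p, & L_b > L_r
\end{cases}
```

The comparisons in the study probe how well this interpolation tracks nonlinear FE results for slender built-up sections under top-flange loading.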
Abstract: Hierarchical classification is a problem with applications in many areas, such as protein function prediction, where the data are hierarchically structured. Therefore, it is necessary to develop algorithms able to induce hierarchical classification models. This paper presents experiments using the algorithm for hierarchical classification called Multi-label Hierarchical Classification using a Competitive Neural Network (MHC-CNN). It was tested on ten datasets from the Gene Ontology (GO) Cellular Component domain. The results are compared with those of Clus-HMC and Clus-HSC using the hF-measure.
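The hF-measure used for the comparison augments predicted and true class sets with all their ancestors in the hierarchy before micro-averaging precision and recall. A minimal sketch follows; the toy hierarchy and parent-map representation are illustrative assumptions, not a GO dataset.

```python
# Sketch of the hierarchical F-measure (hF): classes are expanded to include
# their ancestors, then micro-averaged precision/recall are combined.
def ancestors(node, parent):
    """Set containing node and all its ancestors in a parent map."""
    out = set()
    while node in parent:          # walk up toward the root
        out.add(node)
        node = parent[node]
    out.add(node)
    return out

def h_fmeasure(y_true, y_pred, parent):
    tp = t = p = 0
    for true_set, pred_set in zip(y_true, y_pred):
        T = set().union(*(ancestors(c, parent) for c in true_set))
        P = set().union(*(ancestors(c, parent) for c in pred_set))
        tp += len(T & P)           # ancestor-augmented true positives
        t += len(T)
        p += len(P)
    hp, hr = tp / p, tp / t        # hierarchical precision and recall
    return 2 * hp * hr / (hp + hr)
```

Predicting an ancestor of the true class thus earns partial credit, which is why hF is preferred over flat accuracy for hierarchical classifiers.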
Abstract: To simulate the phenomenon of electronic transport in semiconductors, we adapt a numerical method, most frequently the Monte Carlo method. In our work, we applied this method to the case of a ternary alloy semiconductor, GaInP, in its cubic form. The calculations are made using a non-parabolic effective-mass energy band model. We consider a conduction band with three valleys (Γ, L, X); the major scattering mechanisms are taken into account in this modeling, such as the interactions with acoustic phonons (elastic collisions) and optical phonons (inelastic collisions). Polar optical phonons cause anisotropic intra-valley collisions, which are very probable in III-V semiconductors. Other, non-polar optical phonons allow inter-valley transitions. Initially, we present the full results obtained by Monte Carlo simulation of GaInP in the stationary regime. We then consider the effects related to the application of an electric field varying in time, and thus study the transient phenomena that make their appearance in the ternary material.
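The core of any single-particle Monte Carlo transport code is the free-flight/scattering loop: exponentially distributed flight times under the field, then a scattering event. A heavily simplified illustrative sketch is below, assuming one parabolic valley, a constant total scattering rate, complete velocity randomization at each collision, and placeholder parameters that are not fitted GaInP values.

```python
# Illustrative single-valley Monte Carlo free-flight / scattering loop (1-D field).
import math, random

Q = 1.602e-19           # electron charge (C)
M = 0.08 * 9.109e-31    # placeholder effective mass (kg), not a GaInP fit
GAMMA = 1.0e13          # placeholder constant total scattering rate (1/s)

def drift_velocity(field, n_flights=20000, seed=1):
    """Average drift velocity (m/s) under a constant field (V/m)."""
    random.seed(seed)
    v, x_total, t_total = 0.0, 0.0, 0.0
    a = Q * field / M                              # acceleration between collisions
    for _ in range(n_flights):
        dt = -math.log(random.random()) / GAMMA    # exponential free-flight time
        x_total += v * dt + 0.5 * a * dt * dt      # ballistic motion during flight
        t_total += dt
        v = random.gauss(0.0, 1.0e5)               # crude isotropic scattering: reset v
    return x_total / t_total
```

A production code would instead track energy-dependent rates for each mechanism (acoustic, polar and non-polar optical) and the three valleys, choosing the mechanism at each collision by its partial rate.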
Abstract: In this article we address the problem of mobile robot formation control. Indeed, most work in this domain has extensively studied classical control for keeping a formation of mobile robots. In this work, we design a fuzzy logic controller (FLC) for separation and bearing control (SBC). The leader mobile robot is controlled to follow an arbitrary reference path, and the follower mobile robot uses the FSBC (Fuzzy Separation and Bearing Control) to keep a constant relative distance and a constant angle to the leader robot. The efficiency and simplicity of this control law have been demonstrated by simulation in different situations.
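The two quantities the FSBC regulates, the separation l and the bearing ψ, are standard leader-follower geometry. A minimal sketch of how they are measured from the leader pose and follower position is below; variable names are illustrative, and the fuzzy rule base itself is not shown.

```python
# Sketch of the separation/bearing measurement used in leader-follower control.
import math

def separation_bearing(leader_pose, follower_pos):
    """(x, y, theta) leader pose, (x, y) follower position -> (l, psi)."""
    lx, ly, ltheta = leader_pose
    fx, fy = follower_pos
    dx, dy = fx - lx, fy - ly
    l = math.hypot(dx, dy)                             # relative distance (separation)
    psi = math.atan2(dy, dx) - ltheta                  # bearing in the leader's frame
    psi = (psi + math.pi) % (2 * math.pi) - math.pi    # wrap to [-pi, pi)
    return l, psi
```

The controller's inputs would be the errors l − l_desired and ψ − ψ_desired, which the fuzzy rules map to follower wheel velocities.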
Abstract: The issue of classifying objects into one of several predefined
groups when the measured variables are a mixture of different types
of variables has been of interest among statisticians for many
years. Some methods for dealing with such situations have been
introduced, including parametric, semi-parametric and non-parametric
approaches. This paper discusses the problem
of classifying data when the number of measured mixed variables is
larger than the sample size. A proposed idea that integrates a
dimensionality reduction technique via principal component analysis
and a discriminant function based on the location model is discussed.
The study aims to offer practitioners another potential tool for
classification problems where the
observed variables are mixed and high-dimensional.
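The first stage of the proposed pipeline, reducing a p ≫ n measurement matrix to a few principal components before discrimination, can be sketched with plain linear algebra. The nearest-centroid rule below is a simple stand-in for the location-model discriminant, whose details are beyond this sketch; the data are synthetic.

```python
# Sketch: PCA reduction (via SVD, valid even when p >> n) followed by a
# simple classifier in the reduced space.
import numpy as np

def pca_project(X, k):
    """Project rows of X onto the k leading principal components."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return (X - mean) @ Vt[:k].T

def centroid_classify(Z_train, y_train, Z_test):
    """Nearest-centroid rule in the reduced space (stand-in for the
    location-model discriminant)."""
    labels = np.unique(y_train)
    cents = np.stack([Z_train[y_train == c].mean(axis=0) for c in labels])
    d = np.linalg.norm(Z_test[:, None, :] - cents[None, :, :], axis=2)
    return labels[d.argmin(axis=1)]
```

Working in the k-dimensional score space sidesteps the singular covariance estimates that make classical discriminants unusable when variables outnumber samples.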
Abstract: Combined therapy using Interferon and Ribavirin is the standard treatment for patients with chronic hepatitis C. However, the number of responders to this treatment is low, whereas its cost and side effects are high. Therefore, there is a clear need to predict a patient's response to the treatment based on clinical information, to protect patients from serious drawbacks, intolerable side effects, and wasted money. Different machine learning techniques have been developed to fulfill this purpose. Among these techniques are Associative Classification (AC) and Decision Tree (DT). The aim of this research is to compare the performance of these two techniques in predicting the virological response to the standard treatment of HCV from clinical information. 200 patients treated with Interferon and Ribavirin were analyzed using AC and DT. 150 cases were used to train the classifiers and 50 cases were used to test them. The experimental results showed that both techniques gave acceptable results; however, the best accuracy reached 92% for AC, whereas for DT it reached 80%.
Abstract: The internet is constantly expanding. Identifying web
links of interest from web browsers requires users to visit each of the
listed links individually until a satisfactory link is found; users therefore
need to evaluate a considerable number of links before
finding their link of interest, which can be tedious and even
unproductive. By incorporating web assistance, web users could
benefit from reduced time searching for relevant websites. In this
paper, a rough set approach is presented which facilitates
classification of the unlimited available e-vocabulary, to assist web users
in reducing the time spent searching for relevant web sites. This
approach includes two methods for identifying relevant data on web
links based on the priority and percentage of relevance. As a result of
these methods, a list of web sites is generated in priority sequence
with emphasis on the search criteria.
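The rough-set machinery behind such an approach partitions links into indiscernibility classes by their attribute values and brackets the "relevant" set between lower and upper approximations. A minimal sketch follows; the attribute names and toy link table are illustrative assumptions.

```python
# Sketch of rough-set lower/upper approximations over a table of web links.
def approximations(objects, attrs, target):
    """objects: {link: {attribute: value}}; target: set of relevant links."""
    # Indiscernibility classes: links identical on the chosen attributes.
    classes = {}
    for link, row in objects.items():
        key = tuple(row[a] for a in attrs)
        classes.setdefault(key, set()).add(link)
    lower, upper = set(), set()
    for cls in classes.values():
        if cls <= target:      # whole class inside target: certainly relevant
            lower |= cls
        if cls & target:       # class touches target: possibly relevant
            upper |= cls
    return lower, upper
```

Links in the lower approximation can be ranked first, those only in the upper approximation next, which is one natural way to produce a priority-ordered list.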
Abstract: A novel feature selection strategy to improve recognition accuracy on faces affected by non-uniform illumination, partial occlusions and varying expressions is proposed in this paper. This technique is especially applicable in scenarios where the possibility of obtaining a reliable intra-class probability distribution is minimal due to a small number of training samples. Phase congruency features in an image are defined as the points where the Fourier components of that image are maximally in phase. These features are invariant to the brightness and contrast of the image under consideration. This property allows us to achieve the goal of lighting-invariant face recognition. Phase congruency maps of the training samples are generated and a novel modular feature selection strategy is implemented. Smaller sub-regions from a predefined neighborhood within the phase congruency images of the training samples are merged to obtain a large set of features. These features are arranged in order of increasing distance between the sub-regions involved in merging. The assumption behind the proposed region merging and arrangement strategy is that local dependencies among pixels are more important than global dependencies. The obtained feature sets are then arranged in decreasing order of discriminating capability using a criterion function, which is the ratio of the between-class variance to the within-class variance of the sample set, in the PCA domain. The results indicate a significant improvement in classification performance compared to baseline algorithms.
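The criterion function named above, the ratio of between-class to within-class variance, is a standard Fisher-style score and is easy to sketch per feature. The data below are illustrative; in the paper the score is applied to merged phase-congruency feature sets in the PCA domain rather than to raw columns.

```python
# Sketch of the between-class / within-class variance ratio used to rank
# feature sets by discriminating capability.
import numpy as np

def fisher_scores(X, y):
    """Per-feature ratio of between-class to within-class variance."""
    mu = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - mu) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    return between / within          # higher means more discriminative
```

Feature sets would then be consumed in decreasing order of this score.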
Abstract: Skin color can provide a useful and robust cue
for human-related image analysis, such as face detection,
pornographic image filtering, hand detection and tracking,
people retrieval in databases and the Internet, etc. The major
problem of such skin color detection algorithms is
that they are time consuming and hence cannot be applied in real-time
systems. To overcome this problem, we introduce a new
fast technique for skin detection which can be applied in a real-time
system. In this technique, instead of testing each image
pixel to label it as skin or non-skin (as in classic techniques),
we skip a set of pixels. The rationale for the skipping process is
the high probability that neighbors of skin color pixels are
also skin pixels, especially in adult images, and vice versa. The
proposed method can rapidly detect skin and non-skin color
pixels, which in turn dramatically reduces the CPU time
required for the protection process. Since many fast detection
techniques are based on image resizing, we apply our
proposed pixel skipping technique with image resizing to
obtain better results. The performance evaluation of the
proposed skipping and hybrid techniques in terms of the
measured CPU time is presented. Experimental results
demonstrate that the proposed methods achieve better results
than the relevant classic method.
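The pixel-skipping idea above can be sketched directly: test only every step-th pixel and, when a sample is classified as skin, label its whole step-sized block as skin instead of testing each neighbor. The skin rule below is a placeholder threshold, not the paper's classifier, and the step size is an illustrative assumption.

```python
# Sketch of skin detection with pixel skipping: sample on a coarse grid and
# fill whole blocks, exploiting the spatial coherence of skin regions.
import numpy as np

def skin_mask_skipping(img, is_skin, step=4):
    """Boolean skin mask built by sampling every `step`-th pixel."""
    h, w = img.shape[:2]
    mask = np.zeros((h, w), dtype=bool)
    for i in range(0, h, step):
        for j in range(0, w, step):
            if is_skin(img[i, j]):
                # Neighbors of a sampled skin pixel are very likely skin too.
                mask[i:i + step, j:j + step] = True
    return mask
```

With step = 4 only one pixel in 16 is classified, which is where the CPU-time savings come from; combining this with image resizing gives the hybrid scheme evaluated above.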
Abstract: Although the field of parametric Pattern Recognition (PR) has been thoroughly studied for over five decades, the use of the Order Statistics (OS) of the distributions to achieve this has not been reported. The pioneering work on using OS for classification was presented in [1] for the Uniform distribution, where it was shown that optimal PR can be achieved in a counter-intuitive manner, diametrically opposed to the Bayesian paradigm, i.e., by comparing the testing sample to a few samples distant from the mean. This must be contrasted with the Bayesian paradigm in which, if we are allowed to compare the testing sample with only a single point in the feature space from each class, the optimal strategy would be to achieve this based on the (Mahalanobis) distance from the corresponding central points, for example, the means. In [2], we showed that the results could be extended for a few symmetric distributions within the exponential family. In this paper, we attempt to extend these results significantly by considering asymmetric distributions within the exponential family, for some of which even the closed form expressions of the cumulative distribution functions are not available. These distributions include the Rayleigh, Gamma and certain Beta distributions. As in [1] and [2], the new scheme, referred to as Classification by Moments of Order Statistics (CMOS), attains an accuracy very close to the optimal Bayes’ bound, as has been shown both theoretically and by rigorous experimental testing.
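For the Uniform case from [1] referenced above, the moments of the order statistics have a simple closed form that makes the CMOS idea concrete: for n i.i.d. samples from U(a, b), the k-th order statistic follows a scaled Beta distribution, so

```latex
E\!\left[x_{(k)}\right] = a + \frac{k}{n+1}\,(b-a), \qquad k = 1,\dots,n.
```

The "few samples distant from the mean" used for comparison thus correspond to k near 1 and k near n, the expected extreme order statistics, rather than to the central point k ≈ n/2 that the Bayesian single-point strategy would suggest.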
Abstract: The very nonlinear nature of the generator and system
behaviour following a severe disturbance precludes the use of
classical linear control techniques. In this paper, a new approach to
nonlinear control is proposed for transient and steady state stability
analysis of a synchronous generator. The control law of the generator
excitation is derived on the basis of the Lyapunov stability criterion.
The overall stability of the system is shown using Lyapunov
technique. The application of the proposed controller to simulated
generator excitation control under a large sudden fault and wide
range of operating conditions demonstrates that the new control
strategy is superior to a conventional automatic voltage regulator
(AVR) and shows very promising results.
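For reference, the Lyapunov criterion invoked above requires, for closed-loop dynamics \(\dot{x} = f(x)\) with equilibrium at the origin, a scalar function V satisfying

```latex
V(0) = 0, \qquad V(x) > 0 \;\; (x \neq 0), \qquad
\dot{V}(x) = \nabla V(x)^{\top} f(x) < 0 \;\; (x \neq 0),
```

which guarantees asymptotic stability of the equilibrium. The excitation control law is then chosen so that \(\dot{V}\) remains negative along the generator's trajectories, including after a large fault.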
Abstract: In this paper, the Fuzzy Autocatalytic Set (FACS) is
composed into Omega Algebra by embedding the membership value
of fuzzy edge connectivity using the property of transitive affinity.
It is then shown that the Omega Algebra of FACS is a transformation semigroup,
which is a special class of semigroup.
Abstract: The success of an electronic system in a System-on-Chip is highly dependent on the efficiency of its interconnection network, which is constructed from routers and channels (the routers move data across the channels between nodes). Since neither classical bus-based nor point-to-point architectures can provide scalable solutions and satisfy the tight power and performance requirements of future applications, the Network-on-Chip (NoC) approach has recently been proposed as a promising solution. Indeed, in contrast to the traditional solutions, the NoC approach can provide large bandwidth with moderate area overhead. The selected topology of the component interconnects plays a prime role in the performance of a NoC architecture, as do the routing and switching techniques that can be used. In this paper, we present two generic NoC architectures that can be customized to the specific communication needs of an application in order to reduce the area with minimal degradation of the latency of the system. An experimental study is performed to compare these structures with basic NoC topologies represented by the 2D mesh, Butterfly-Fat Tree (BFT) and SPIN. It is shown that the Cluster Mesh (CMesh) and MinRoot schemes achieve significant improvements in network latency and energy consumption with only negligible area overhead and complexity over existing architectures. In fact, compared with the basic NoC topologies, the CMesh and MinRoot schemes provide substantial savings in area as well, because they require fewer routers. The simulation results show that the CMesh and MinRoot networks outperform MESH, BFT and SPIN in the main performance metrics.