Abstract: Three-dimensional simulations are carried out to estimate the effect of wind direction, wind speed, and geometry on the flow and dispersion of vehicular pollutants in a street canyon. The pollutant sources are motor vehicles passing between the two buildings. Suitable emission factors for petrol and diesel vehicles at varying vehicle speeds are used to estimate the emission rate from the streets. The dispersion of automobile pollutants released from the street is simulated by introducing a vehicular emission source term as a fixed-flux boundary condition at ground level over the road. The source term is calculated by adopting emission factors from the literature for varying street traffic conditions. It is observed that an increase in wind angle disturbs the symmetric pattern of pollutant distribution along the street length: the concentration increases at the far end of the street compared with the near end.
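As a minimal sketch of how such a fixed-flux source term might be assembled (the emission factors, traffic counts, and street dimensions below are hypothetical, not the study's values):

# Vehicular emission source term as a fixed-flux boundary condition
# over the road surface (all numbers are illustrative).
EMISSION_FACTORS = {"petrol": 2.5, "diesel": 1.8}  # hypothetical g/km per vehicle

def source_flux(counts_per_hour, street_length_m, street_width_m):
    """Return pollutant mass flux in g/(m^2 s) over the road surface."""
    grams_per_hour = sum(
        EMISSION_FACTORS[vtype] * (street_length_m / 1000.0) * n
        for vtype, n in counts_per_hour.items()
    )
    road_area_m2 = street_length_m * street_width_m
    return grams_per_hour / 3600.0 / road_area_m2

# e.g. 1200 petrol and 400 diesel vehicles per hour on a 200 m x 10 m street
print(source_flux({"petrol": 1200, "diesel": 400}, 200.0, 10.0))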
Abstract: In this paper, a neural tree (NT) classifier having a
simple perceptron at each node is considered. A new concept for
building a balanced tree is applied in the learning algorithm of the
tree. At each node, if the perceptron's classification is inaccurate
or unbalanced, the perceptron is replaced by a new one that separates
the training set so that an almost equal number of patterns falls
into each class. Moreover, each perceptron is trained only on the
classes present at its node and ignores the other classes. Splitting
nodes are incorporated into the neural tree architecture to divide
the training set when the current perceptron node repeats the
classification of its parent node. A new error function based on the
depth of the tree is introduced to reduce the computational time for
training a perceptron. Experiments are performed to evaluate the
efficiency of the method, and encouraging results are obtained in
terms of accuracy and computational cost.
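The balanced-split criterion can be pictured with a small sketch: train a perceptron at a node and test whether it routes a comparable share of the training patterns to each child. This illustrates the idea only; the function name and tolerance are hypothetical and not the paper's algorithm.

import numpy as np
from sklearn.linear_model import Perceptron

def split_is_balanced(X, y, tolerance=0.2):
    """Train a node perceptron and check that each side of the split
    receives a comparable share of patterns (tolerance is illustrative)."""
    clf = Perceptron().fit(X, y)
    side = clf.decision_function(X) >= 0.0   # which child each pattern goes to
    share = side.mean()
    return clf, abs(share - 0.5) <= tolerance

X = np.random.RandomState(0).randn(200, 2)
y = (X[:, 0] + X[:, 1] > 0).astype(int)
clf, balanced = split_is_balanced(X, y)
print("balanced split:", balanced)   # if False, the node perceptron is replaced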
Abstract: Food safety is an important concern for holidaymakers in
foreign and unfamiliar tourist destinations. Risk from food in these
destinations influences tourist perception; it can affect physical
health and lead to an inability to pursue planned activities. The
objective of this paper was to compare foreign tourists'
demographics, including gender, age, and education level, with their
level of perceived risk towards food safety. A total of 222 foreign
tourists staying at Khao San Road in Bangkok were used as the sample.
Independent-samples t-tests, analysis of variance, and the Least
Significant Difference (LSD) post hoc test were utilized. The
findings revealed few demographic differences in the level of
perceived risk among the foreign tourists. The post hoc test
indicated significant differences between older and younger tourists,
and between higher and lower levels of education. Rankings of
tourists' perceived risk towards food safety unveiled some
interesting results. Tourists' perceived risk of food safety in
established restaurants can be ranked as i) cleanliness of dining
utensils, ii) sanitation of the food preparation area, and iii)
cleanliness of food seasoning and ingredients, whereas their
perceived risk of food safety in street food and drink can be ranked
as i) cleanliness of stalls and pushcarts, ii) cleanliness of the
food sold, and iii) personal hygiene of street food hawkers or
vendors.
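The analyses named above are standard; a minimal SciPy sketch (with hypothetical risk scores, not the survey data) shows how they fit together, treating Fisher's LSD as unadjusted pairwise t-tests run after a significant ANOVA:

from itertools import combinations
from scipy import stats

# Hypothetical perceived-risk scores grouped by age band.
groups = {
    "young":  [3.1, 2.8, 3.4, 3.0, 2.9],
    "middle": [3.3, 3.6, 3.1, 3.4, 3.5],
    "old":    [3.9, 4.1, 3.7, 4.0, 3.8],
}

t, p = stats.ttest_ind(groups["young"], groups["old"])   # independent-samples t-test
print(f"t-test: t={t:.2f}, p={p:.4f}")

f, p = stats.f_oneway(*groups.values())                  # one-way ANOVA
print(f"ANOVA: F={f:.2f}, p={p:.4f}")

if p < 0.05:                                             # LSD post hoc
    for a, b in combinations(groups, 2):
        print(a, "vs", b, "p =", round(stats.ttest_ind(groups[a], groups[b])[1], 4))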
Abstract: As test costs in today's semiconductor industry can account for up to 50 percent of total production costs, efficient test error detection becomes more and more important. In this paper, we present a new machine learning approach to test error detection that should provide faster recognition of test system faults as well as improved test error recall. The key idea is to learn a classifier ensemble that detects typical test error patterns in wafer test results immediately after these tests finish. Since test error detection has not yet been discussed in the machine learning community, we define the central problem-relevant terms and provide an analysis of important domain properties. Finally, we present comparative studies reflecting the failure detection performance of three individual classifiers and three ensemble methods based upon them. As base classifiers we chose a decision tree learner, a support vector machine, and a Bayesian network, while the compared ensemble methods were simple and weighted majority vote as well as stacking. For the evaluation, we used cross-validation and a specially designed practical simulation. By implementing our approach in a semiconductor test department for the observation of two products, we demonstrated its practical applicability.
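A minimal scikit-learn sketch of the simple-majority-vote variant over the three base learners named above (Gaussian naive Bayes stands in for the Bayesian network, which scikit-learn does not provide; the data are synthetic, not wafer test results):

from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: the rare positive class plays the role of test errors.
X, y = make_classification(n_samples=500, n_features=20, weights=[0.9, 0.1],
                           random_state=0)

ensemble = VotingClassifier(estimators=[
    ("tree", DecisionTreeClassifier(random_state=0)),
    ("svm", SVC(random_state=0)),
    ("nb", GaussianNB()),        # naive Bayes standing in for a Bayesian network
], voting="hard")                # simple majority vote

# Recall on the rare class, mirroring the paper's focus on test error recall.
print(cross_val_score(ensemble, X, y, cv=5, scoring="recall").mean())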
Abstract: An automatic speech recognition system for formal Arabic
is needed. The Quran is the most formal spoken book in Arabic and is
recited all over the world. In this research, a speaker-independent
automatic speech recognizer for Quranic Arabic was developed and
tested. The system was developed based on tri-phone Hidden Markov
Models and Maximum Likelihood Linear Regression (MLLR). MLLR
computes a set of transformations that reduce the mismatch between
an initial model set and the adaptation data. It uses a regression
class tree and estimates a set of linear transformations for the
mean and variance parameters of a Gaussian mixture HMM system. The
30th chapter of the Quran, recited by five of the most famous
readers of the Quran, was used for training and testing. The chapter
includes about 2000 distinct words. The advantages of using Quranic
verses as the database for this recognizer are the uniqueness of the
words and the high level of orderliness between verses. The accuracy
on the test data ranged from 68% to 85%.
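The MLLR mean adaptation referred to above is the standard linear transform mu_hat = A mu + b, estimated per regression class; a minimal NumPy sketch, with illustrative values rather than transforms estimated from adaptation data:

import numpy as np

def mllr_adapt_mean(mu, A, b):
    """Apply an MLLR mean transform: mu_hat = A @ mu + b."""
    return A @ mu + b

# Illustrative 3-dimensional Gaussian mean; in practice A and b are
# estimated from the adaptation data for each regression class.
mu = np.array([1.0, -0.5, 2.0])
A = 0.9 * np.eye(3)
b = np.array([0.1, 0.0, -0.2])
print(mllr_adapt_mean(mu, A, b))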
Abstract: Although backpropagation ANNs generally predict better
than decision trees for pattern classification problems, they are
often regarded as black boxes, i.e., their predictions cannot be
explained the way decision tree predictions can. In many
applications, it is desirable to extract knowledge from trained ANNs
so that users can gain a better understanding of how the networks
solve the problems. A new rule extraction algorithm, called rule
extraction from artificial neural networks (REANN), is proposed and
implemented to extract symbolic rules from ANNs. A standard
three-layer feedforward ANN is the basis of the algorithm, and a
four-phase training algorithm is proposed for backpropagation
learning. The explicitness of the extracted rules is assessed by
comparing them to the symbolic rules generated by other methods. The
extracted rules are comparable with those of other methods in terms
of the number of rules, the average number of conditions per rule,
and predictive accuracy. Extensive experimental studies on several
benchmark classification problems, such as breast cancer, iris,
diabetes, and season classification, demonstrate the effectiveness
of the proposed approach and its good generalization ability.
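REANN's four-phase procedure is not reproduced here. As a generic illustration of the black-box-to-rules idea, the sketch below uses a different, mimicry-style technique: it fits a shallow decision tree to a trained network's predictions and prints the resulting rules.

from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
ann = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000,
                    random_state=0).fit(X, y)

# Mimicry: train a shallow tree on the ANN's own predictions to obtain
# human-readable rules that approximate the network's behavior.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, ann.predict(X))
print(export_text(tree, feature_names=load_iris().feature_names))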
Abstract: Large-scale systems such as computational Grids are
distributed computing infrastructures that can provide globally
available network resources. The evolution of information processing
systems in Data Grids is characterized by a strong decentralization
of data across several sites, with the objective of ensuring the
availability and reliability of the data in order to provide fault
tolerance and scalability; this is only possible through replication
techniques. Unfortunately, these techniques carry a high cost,
because consistency must be maintained between the distributed data.
Nevertheless, agreeing to live with certain imperfections can
improve the performance of the system by increasing concurrency. In
this paper, we propose a multi-layer protocol combining the
pessimistic and optimistic approaches, designed for data consistency
maintenance in large-scale systems. Our approach is based on a
hierarchical representation model with three layers and serves a
dual purpose: it reduces response times compared with a fully
pessimistic approach, and it improves quality of service compared
with a fully optimistic approach.
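The protocol itself is not specified here in enough detail to reproduce, but the two ingredients it combines can be contrasted in a small sketch (the class names and single-value store are hypothetical):

import threading

class PessimisticReplica:
    """Writers take a lock before updating: consistent, higher latency."""
    def __init__(self, value=0):
        self._lock = threading.Lock()
        self.value = value

    def write(self, value):
        with self._lock:
            self.value = value

class OptimisticReplica:
    """Writers validate a version at commit: lower latency, may retry."""
    def __init__(self, value=0):
        self.value, self.version = value, 0

    def write(self, value, read_version):
        if read_version != self.version:
            return False          # conflict: caller re-reads and retries
        self.value, self.version = value, self.version + 1
        return True

replica = OptimisticReplica()
v = replica.version               # read phase
print(replica.write(42, v))       # commit succeeds: no concurrent update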
Abstract: Ensemble learning algorithms such as AdaBoost and Bagging
have been actively researched and have shown improved classification
results on several benchmark data sets, mainly with decision trees
as their base classifiers. In this paper we experiment with applying
these meta-learning techniques to classifiers such as random
forests, neural networks, and support vector machines. The data sets
are from MAGIC, a Cherenkov telescope experiment. The task is to
classify gamma signals against overwhelming hadron and muon signals,
a rare-class classification problem. We compare the individual
classifiers with their ensemble counterparts and discuss the
results. WEKA, a machine learning toolkit, was used to run the
experiments.
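The experiments were run in WEKA; a rough scikit-learn analogue of bagging and boosting over the stronger base classifiers, with synthetic imbalanced data standing in for the MAGIC set, looks like this:

from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in: the positive (gamma) class is rare.
X, y = make_classification(n_samples=600, weights=[0.95, 0.05], random_state=1)

models = {
    "random forest alone": RandomForestClassifier(n_estimators=50, random_state=1),
    "bagged forests": BaggingClassifier(
        RandomForestClassifier(n_estimators=10, random_state=1),
        n_estimators=10, random_state=1),
    "AdaBoost over SVMs": AdaBoostClassifier(
        SVC(probability=True), n_estimators=10, random_state=1),
}
for name, model in models.items():
    print(name, cross_val_score(model, X, y, cv=3, scoring="f1").mean())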
Abstract: This study proposes a novel recommender system to provide
advertisements for context-aware services. Our proposed model
applies a modified collaborative filtering (CF) algorithm over
several dimensions relevant to the personalization of mobile
devices: location, time, and the user's needs type. In particular,
we employ a classification rule, learned with a decision tree
algorithm, to understand the user's needs type. In addition, we
collect primary data from mobile phone users and apply it to the
proposed model to validate its effectiveness. Experimental results
show that the proposed system delivers more accurate and
satisfactory advertisements than the comparison systems.
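The paper's modified CF over location, time, and needs type is not reproduced here; a minimal user-based CF sketch over a hypothetical user-advertisement rating matrix illustrates the underlying mechanism:

import numpy as np

# Hypothetical user x advertisement ratings (0 = unrated).
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 4, 4]], dtype=float)

def predict(R, user, item):
    """User-based CF: cosine-similarity-weighted average of the ratings
    given to `item` by other users."""
    norms = np.linalg.norm(R, axis=1)
    sims = R @ R[user] / (norms * norms[user] + 1e-9)
    mask = R[:, item] > 0
    mask[user] = False
    if not mask.any():
        return 0.0
    return float(sims[mask] @ R[mask, item] / (sims[mask].sum() + 1e-9))

print(predict(R, user=1, item=2))   # predicted rating for an unseen ad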
Abstract: A sequential decision problem, based on the task of identifying the species of trees given acoustic echo data collected from them, is considered with well-known stochastic classifiers, including single and mixture Gaussian models. Echoes are processed with a preprocessing stage based on a model of mammalian cochlear filtering, using a new discrete low-pass filter characteristic. Stopping time performance of the sequential decision process is evaluated and compared. It is observed that the new low-pass filter processing results in faster sequential decisions.
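A standard instance of such a sequential decision rule is the sequential probability ratio test; the sketch below (illustrative class models and thresholds, not the paper's cochlear-filtered features) shows how the stopping time arises:

import numpy as np
from scipy.stats import norm

def sprt(samples, mu0, mu1, sigma, a=-4.6, b=4.6):
    """Accumulate the log-likelihood ratio of H1 vs H0 over incoming
    echoes; stop at the first threshold crossing. The thresholds
    correspond roughly to 1% error rates and are illustrative."""
    llr = 0.0
    for t, x in enumerate(samples, start=1):
        llr += norm.logpdf(x, mu1, sigma) - norm.logpdf(x, mu0, sigma)
        if llr >= b:
            return t, "H1"
        if llr <= a:
            return t, "H0"
    return len(samples), "undecided"

rng = np.random.default_rng(0)
echo_features = rng.normal(1.0, 1.0, size=50)   # data drawn from H1
print(sprt(echo_features, mu0=0.0, mu1=1.0, sigma=1.0))  # (stopping time, decision)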
Abstract: With optimized bandwidth and latency discrepancy ratios, Node Gain Scores (NGSs) are determined and used as a basis for shaping the max-heap overlay. The NGSs, determined as the respective bandwidth-latency products, govern the construction of max-heap-form overlays. Each NGS is earned as a synergy of the discrepancy ratio of the bandwidth requested with respect to the estimated available bandwidth, and the latency discrepancy ratio between the node and the source node. The tree leads to enhanced-delivery overlay multicasting, increasing packet delivery that could otherwise be hindered by the packet loss induced in other schemes that do not consider the synergy of these parameters when placing nodes on the overlays. The NGS is a function of four main parameters: the estimated available bandwidth, Ba; the individual node's requested bandwidth, Br; the proposed node latency to its prospective parent, Lp; and the suggested best latency as advised by the source node, Lb. The bandwidth discrepancy ratio (BDR) and latency discrepancy ratio (LDR) carry weights of α and (1,000 - α), respectively, with α arbitrarily chosen between 0 and 1,000, to ensure that the NGS values, used as node IDs, maintain a good likelihood of uniqueness and a balance between the more critical of the two factors, the BDR and the LDR. A max-heap-form tree is constructed under the assumption that all nodes possess an NGS less than that of the source node. To maintain load balance, children of each level's siblings are evenly distributed: a node cannot accept a second child until all of its siblings able to do so have acquired the same number of children, proceeding logically from left to right in the conceptual overlay tree. Records of the pairwise approximate available bandwidths, as measured by a pathChirp scheme at individual nodes, are maintained. Evaluations against other schemes, namely Bandwidth Aware multicaSt architecturE (BASE), Tree Building Control Protocol (TBCP), and Host Multicast Tree Protocol (HMTP), have been conducted. The new scheme generally performs better in the trade-off between packet delivery ratio, link stress, control overhead, and end-to-end delay.
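The abstract reads most naturally as an additive weighting of the two discrepancy ratios; a minimal sketch under that reading (the composition, ratio orientations, and parameter values below are one plausible interpretation, not a confirmed formula):

import heapq

def ngs(Ba, Br, Lp, Lb, alpha=500):
    """Node Gain Score as a weighted combination of the bandwidth
    discrepancy ratio (BDR) and latency discrepancy ratio (LDR)."""
    bdr = Br / Ba        # requested vs estimated available bandwidth
    ldr = Lb / Lp        # suggested best vs proposed latency
    return alpha * bdr + (1000 - alpha) * ldr

# Max-heap of joining nodes, assuming every NGS is below the source's
# (heapq is a min-heap, so scores are negated). Values are hypothetical.
nodes = [("n1", ngs(10.0, 4.0, 30.0, 20.0)),
         ("n2", ngs(8.0, 6.0, 25.0, 20.0)),
         ("n3", ngs(12.0, 3.0, 40.0, 20.0))]
heap = [(-score, name) for name, score in nodes]
heapq.heapify(heap)
print(heapq.heappop(heap))   # highest-NGS node is placed nearest the source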
Abstract: This paper presents an information retrieval model for
XML documents based on tree matching. Queries and documents are
represented by extended trees. An extended tree is built from the
original tree by adding weighted virtual links between each node and
its indirect descendants, allowing each descendant to be reached
directly; therefore only one level separates a node from its
indirect descendants. This allows the user query and the document to
be compared flexibly while respecting the structural constraints of
the query. The content of each node is very important in deciding
whether a document element is relevant, so the content should be
taken into account in the retrieval process. We separate the
structure-based and the content-based retrieval processes. The
content-based score of each node is commonly based on the well-known
Tf × Idf criterion. In this paper, we compare this criterion with
another one we call Tf × Ief. The comparison is based on experiments
on a dataset provided by INEX, to show the effectiveness of our
approach on the one hand and that of both weighting functions on the
other.
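A minimal sketch of the content score: the same routine yields Tf × Idf when frequencies are counted over whole documents, and the Tf × Ief variant when they are counted over document elements (nodes), reading Ief as inverse element frequency; that reading is ours, not spelled out above.

import math
from collections import Counter

def tf_ixf(term, element_tokens, units):
    """Tf x Idf if `units` are whole documents; Tf x Ief if `units`
    are document elements (nodes)."""
    tf = Counter(element_tokens)[term]
    df = sum(1 for unit in units if term in unit)
    return tf * math.log(len(units) / (1 + df))

element = ["xml", "retrieval", "xml"]
documents = [{"xml", "query"}, {"tree", "matching"},
             {"xml", "retrieval"}, {"ranking"}]
print(tf_ixf("xml", element, documents))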
Abstract: Latitudinal bands of solar sunspot rotation are studied based on intelligent computation methods. A combination of an image fusion method and quad-tree decomposition is used to obtain quantitative values for the latitudes of the trajectories on the Sun's surface around which sunspots rotate. Daily solar images taken with the SOlar and Heliospheric Observatory (SOHO) satellite are fused for each month separately. The fused image is decomposed with the quad-tree decomposition method in order to obtain precise information about the latitudes of sunspot trajectories. Such analysis is useful for gathering information about the regions of the Sun's surface, and the coordinates in space, that are more exposed to solar geomagnetic storms, tremendous flares, and hot plasma gases permeating interplanetary space, and thus helps humans protect their technical systems. Here, sunspot images from September, October, and November 2001 are used to study the magnetic behavior of the Sun.
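Quad-tree decomposition itself is standard: a block is kept if it is nearly homogeneous and split into four quadrants otherwise. A minimal sketch on a toy square image (the homogeneity threshold is illustrative):

import numpy as np

def quadtree(block, threshold=10.0, origin=(0, 0)):
    """Yield (row, col, size) of homogeneous blocks of a square,
    power-of-two-sized image."""
    if block.max() - block.min() <= threshold or block.shape[0] == 1:
        yield (*origin, block.shape[0])
        return
    h = block.shape[0] // 2
    r, c = origin
    for dr, dc in [(0, 0), (0, h), (h, 0), (h, h)]:
        yield from quadtree(block[dr:dr + h, dc:dc + h],
                            threshold, (r + dr, c + dc))

rng = np.random.default_rng(0)
image = rng.integers(0, 255, size=(8, 8)).astype(float)
image[:4, :4] = 100.0                 # one homogeneous quadrant
print(list(quadtree(image)))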
Abstract: This research presents a system for post-processing of
data that takes mined flat rules as input and discovers crisp as
well as fuzzy hierarchical structures using a Learning Classifier
System approach. A Learning Classifier System (LCS) is a machine
learning technique that combines evolutionary computing,
reinforcement learning, supervised or unsupervised learning, and
heuristics to produce adaptive systems. An LCS learns by interacting
with an environment from which it receives feedback in the form of a
numerical reward; learning is achieved by trying to maximize the
amount of reward received. A crisp description of a concept usually
cannot represent human knowledge completely and practically. In the
proposed Learning Classifier System, the initial population is
constructed as a random collection of HPR-trees (related production
rules), and crisp/fuzzy hierarchies are evolved. A fuzzy subsumption
relation is suggested for the proposed system, and based on a
Subsumption Matrix (SM), a suitable fitness function is proposed.
Suitable genetic operators are proposed for the chosen chromosome
representation method, and to implement reinforcement, a suitable
reward and punishment scheme is also proposed. Experimental results
are presented to demonstrate the performance of the proposed system.
Abstract: Early diagnostic decision making in industrial processes is absolutely necessary to produce high-quality final products. It helps to provide early warning of a special event in a process, so that its assignable cause can be found. This work presents hybrid diagnostic schemes for batch processes in which nonlinear representation of the raw process data is combined with classification tree techniques. The nonlinear kernel-based dimension reduction is executed to obtain nonlinear classification decision boundaries for the fault classes. To enhance diagnosis performance for batch processes, the data are filtered to remove irrelevant information. To assess the diagnosis performance of several representation, filtering, and future-observation estimation methods, four diagnostic schemes are evaluated. In this work, the performance of the presented diagnosis schemes is demonstrated using batch process data.
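A minimal scikit-learn sketch of this pipeline, with KernelPCA as a generic stand-in for the kernel-based dimension reduction and synthetic two-class data in place of batch-process fault classes:

from sklearn.datasets import make_moons
from sklearn.decomposition import KernelPCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=400, noise=0.15, random_state=0)  # nonlinear classes

diagnoser = make_pipeline(
    KernelPCA(n_components=2, kernel="rbf", gamma=2.0),   # nonlinear representation
    DecisionTreeClassifier(max_depth=3, random_state=0),  # classification tree
)
print(cross_val_score(diagnoser, X, y, cv=5).mean())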
Abstract: In this paper, we represent protein structures using
graphs, so that a protein structure database becomes a graph
database. Each graph is represented by a spectral vector. We use the
Jacobi rotation algorithm to calculate the eigenvalues of the
normalized Laplacian of the graph's adjacency matrix. To measure the
similarity between two graphs, we calculate the Euclidean distance
between their spectral vectors. To cluster the graphs, we use an
M-tree with the Euclidean distance on the spectral vectors; the
M-tree can also be used for graph searching in the graph database.
Our proposed method was tested on a graph database of 100 graphs
representing 100 protein structures downloaded from the Protein Data
Bank (PDB), and we compared the results with the SCOP hierarchical
structure.
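A minimal NumPy sketch of the spectral vector and the distance between two toy graphs (NumPy's symmetric eigensolver stands in for the Jacobi rotation algorithm):

import numpy as np

def spectral_vector(A):
    """Sorted eigenvalues of the symmetric normalized Laplacian
    L = I - D^(-1/2) A D^(-1/2) of adjacency matrix A."""
    d = A.sum(axis=1)
    d_inv_sqrt = np.where(d > 0, 1.0 / np.sqrt(d), 0.0)
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    return np.sort(np.linalg.eigvalsh(L))   # stands in for Jacobi rotation

path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # toy graph 1
tri = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)   # toy graph 2
print(np.linalg.norm(spectral_vector(path) - spectral_vector(tri)))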
Abstract: Historic preservation areas are extremely vulnerable to disasters because they are home to many vulnerable people and contain many closely spaced wooden houses. However, the narrow streets in these regions have historic meaning, which means that they cannot be widened and can become blocked easily during large disasters. Here, we describe our efforts to establish a methodology for planning evacuation routes in such historic preservation areas. In particular, this study aims to clarify the effectiveness of measures intended to secure two-way evacuation routes for vulnerable people during large disasters in a historic area preserved under the Cultural Properties Protection Law, Japan.
Abstract: The self-organizing map (SOM) provides both clustering and visualization capabilities for data mining. Dynamic self-organizing maps such as the Growing Self-Organizing Map (GSOM) have been developed to overcome the fixed structure of the SOM and enable a better representation of the discovered patterns. However, when mining large datasets or historical data, the hierarchical structure of the data is also useful for viewing cluster formation at different levels of abstraction. In this paper, we present a technique to generate concept trees from the GSOM. The formation of trees from different spread-factor values of the GSOM is also investigated, and the quality of the trees is analyzed. The results show that concept trees can be generated from the GSOM, thus eliminating the need to re-cluster the data from scratch to obtain a hierarchical view of the data under study.
Abstract: Chronic hepatitis B can evolve into cirrhosis and liver
cancer. Interferon is the only effective treatment for carefully
selected patients, but it is very expensive. Some of the selection
criteria are based on liver biopsy, an invasive, costly, and painful
medical procedure. Therefore, developing efficient non-invasive
selection systems would benefit patients and also save money. We
investigated the possibility of creating intelligent systems to
assist the Interferon therapeutic decision, mainly by predicting the
results of the biopsy with acceptable accuracy. We applied knowledge
discovery to integrated medical data: imaging, clinical, and
laboratory data. The resulting intelligent systems, tested on 500
patients with chronic hepatitis B and based on C5.0 decision trees
and boosting, predict the results of the liver biopsy with 100%
accuracy. Also, by integrating the other patient selection criteria,
they offer non-invasive support for the correct Interferon
therapeutic decision. To the best of our knowledge, these decision
systems outperformed all similar systems published in the
literature, and they offer a realistic opportunity to replace liver
biopsy in this medical context.
Abstract: The forest stand consisted of four layers. The species
composition of the third and bottom layers was almost similar,
whereas the top layer and the lower three layers were almost
mutually exclusive. The values of Shannon's index H' and Pielou's
index J' tended to increase from the bottom layer upward, except for
the H' value of the top layer. The values of H' and J' were 4.21
bits and 0.73, respectively, for the total stand. The high woody
species diversity of the forest depended on large trees in the upper
layers, a trend that differed from a subtropical evergreen broadleaf
forest grown in a silicate habitat in the northern part of Okinawa
Island. The spatial distributions of trees in the third and bottom
layers overlapped, whereas the top layer was independent of, or
slightly exclusive with, the lower three layers. The mean tree
weight of each layer decreased from the top toward the bottom layer,
whereas the corresponding tree density increased from the top
downward. This relationship is analogous to the process of
self-thinning in plant populations.
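The two indices reported above follow the standard definitions H' = -sum(p_i log2 p_i) and J' = H' / log2(S); a minimal sketch with hypothetical abundances:

import math

def shannon_pielou(abundances):
    """Shannon diversity H' in bits and Pielou evenness J' = H'/log2(S)."""
    total = sum(abundances)
    ps = [n / total for n in abundances if n > 0]
    h = -sum(p * math.log2(p) for p in ps)
    j = h / math.log2(len(ps)) if len(ps) > 1 else 0.0
    return h, j

# Hypothetical stem counts for the species present in one layer.
print(shannon_pielou([120, 80, 40, 20, 10]))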