Abstract: Since primary school trips usually start from home,
many scholars have focused on the home end for
data gathering. Category analysis has then often been relied
upon to predict school travel demand. In this paper, the school
end was used for data gathering and multivariate regression for
predicting future travel demand. The study was carried out in
Skudai Town, Malaysia, which was divided into 5 zones; 9859 pupils
were surveyed by questionnaire at 21 primary schools. The
hypothesis was that, because school trips are fixed, the number of
primary school trip ends is expected to be the same at both ends,
so the choice of trip end would have an inconsequential effect on
the outcome. The
study compared empirical data for home and school trip end
productions and attractions. The variance between the two data sets
was insignificant, although some claims from the home-based family
survey were found to be grossly exaggerated. Data from the school
trip ends were relied on for travel demand prediction because of their
completeness. Accessibility, trip attraction and trip production were
then related to school trip rates under daylight and dry weather
conditions. The paper concluded that accessibility is an important
parameter when predicting demand for future school trip rates.
Abstract: Multi-dimensional principal component analysis
(PCA) is an extension of PCA, which is widely used as a
dimensionality reduction technique in multivariate data analysis, to
handle multi-dimensional data. To calculate the PCA, the singular
value decomposition (SVD) is commonly employed because of
its numerical stability. The multi-dimensional PCA can be calculated
by using the higher-order SVD (HOSVD), proposed by
Lathauwer et al., in the same way as ordinary PCA is calculated
with the SVD. In this
paper, we apply the multi-dimensional PCA to multi-dimensional
medical data including the functional independence measure (FIM)
score, and describe the results of the experimental analysis.
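To illustrate the HOSVD computation mentioned in this abstract, here is a minimal NumPy sketch. It is not the authors' code: the function names (`unfold`, `mode_product`, `hosvd`, `reconstruct`) are assumptions made for this example. Each factor matrix is taken from the SVD of the corresponding mode unfolding, and the core tensor is obtained by projecting the data onto those factors.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: rows index the chosen mode, columns index all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode` (the mode-n product)."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)

def hosvd(tensor):
    """Return (core, factors) such that tensor = core x_1 U1 x_2 U2 ... x_N UN."""
    factors = []
    for mode in range(tensor.ndim):
        # Left singular vectors of each unfolding give that mode's factor matrix.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u)
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Invert the decomposition by applying each factor matrix to the core."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out
```

For dimensionality reduction, one would keep only the leading columns of each factor matrix before forming the core; the sketch keeps all of them, so the reconstruction is exact up to rounding.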
Abstract: Several works on facial recognition have dealt with methods that identify isolated characteristics of the face or with templates that encompass several regions of it. In this paper, we introduce a new technique that approaches the problem holistically, dispensing with the need to identify geometrical characteristics or regions of the face. A face is characterized by randomly sampling selected attributes of the pixels of its image. From this information we construct a data set corresponding to the values of low frequencies, gradient, entropy, and several other pixel characteristics of the image, generating a set of "p" variables. The multivariate data set is approximated with different polynomials that minimize the data fitting error in the minimax sense (L∞ norm). Using a Genetic Algorithm (GA), we are able to circumvent the problem of dimensionality inherent in higher-degree polynomial approximations. The GA yields the degree and the values of a set of coefficients of the polynomials approximating the image of a face. The system is trained by finding, through a resampling process, a family of characteristic polynomials in several variables (pixel characteristics) for each face (say Fi) in the database. A face (say F) is recognized by finding its characteristic polynomials and using an AdaBoost classifier to compare F's polynomials with each of the Fi's polynomials. The winner is the polynomial family closest to F's, which corresponds to the target face in the database.
Abstract: In recent works related to mixture discriminant
analysis (MDA), the expectation-maximization (EM) algorithm is
used to estimate the parameters of Gaussian mixtures. However, the
initial values of the EM algorithm affect the final parameter estimates.
Also, when the EM algorithm is applied twice to the same data set, it can
give different parameter estimates, and this affects the
classification accuracy of MDA. To overcome this problem, we use
the Self Organizing Mixture Network (SOMN) algorithm to estimate
the parameters of Gaussian mixtures in MDA, because SOMN is more robust
when random initial values of the parameters are used [5]. We
show the effectiveness of this method on popular simulated waveform
data sets and a real glass data set.
Abstract: Soil properties such as field capacity (F.C.) and permanent wilting point (P.W.P.) play important roles in the study of the soil moisture retention curve. Although these parameters can be measured directly, their measurement is difficult and expensive. Pedotransfer functions (PTFs) provide an alternative by estimating soil parameters from more readily available soil data. In this investigation, 70 soil samples were collected from different horizons of 15 soil profiles located in the Ziaran region, Qazvin province, Iran. The data set was divided into two subsets for calibration (80%) and testing (20%) of the models, and their normality was tested by the Kolmogorov-Smirnov method. Both multivariate regression and artificial neural network (ANN) techniques were employed to develop appropriate PTFs for predicting soil parameters from the easily measurable characteristics clay, silt, O.C., S.P., B.D. and CaCO3. The performance of the multivariate regression and ANN models was evaluated on an independent test data set using root mean square error (RMSE) and R². The comparison of RMSE for the two models showed that the ANN model gives better estimates of F.C. and P.W.P. than the multivariate regression model. The values of RMSE and R² obtained by the ANN model for F.C. and P.W.P. were (2.35, 0.77) and (2.83, 0.72), respectively. The corresponding values for the multivariate regression model were (4.46, 0.68) and (5.21, 0.64), respectively. The results showed that an ANN with five neurons in the hidden layer performed better in predicting soil properties than multivariate regression.
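The two evaluation metrics named in this abstract can be sketched as follows. This is an illustrative example only: the function names are hypothetical, and R² is computed here as the coefficient of determination, which is an assumption because the abstract does not state which definition was used.

```python
import numpy as np

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

Both functions would be applied to the held-out test subset only, so that the comparison between models reflects predictive rather than fitted performance.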
Abstract: Recommender systems are usually regarded as an
important marketing tool in e-commerce. They use important
information about users to facilitate accurate recommendation. The
information includes user context such as location, time and interest
for personalization of mobile users. We can easily collect information
about location and time because mobile devices communicate with the
base station of the service provider. However, information about user
interest cannot be easily collected because it cannot be
captured automatically without the user's approval. User interest
is usually represented as a need. In this study, we classify needs into two
types according to prior research. This study investigates the
usefulness of data mining techniques for classifying user need type for
recommendation systems. We employ several data mining techniques
including artificial neural networks, decision trees, case-based
reasoning, and multivariate discriminant analysis. Experimental
results show that the CHAID algorithm outperforms the other models for
classifying user need type. This study uses the McNemar test to
examine the statistical significance of the differences in classification
results. The results of the McNemar test also show that CHAID performs
better than the other models with statistical significance.
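The McNemar test used to compare two classifiers on the same test cases can be sketched as below. This is an assumption-laden illustration, not the study's procedure: it uses the chi-square form with continuity correction (the abstract does not say which variant was applied), and the function name and arguments are hypothetical. Only the discordant counts matter: `b` is the number of cases the first model classified correctly and the second incorrectly, and `c` is the reverse.

```python
import math

def mcnemar_test(b, c):
    """McNemar chi-square test with continuity correction (1 degree of freedom).

    b: cases model A got right and model B got wrong.
    c: cases model A got wrong and model B got right.
    Returns (statistic, p_value).
    """
    if b + c == 0:
        raise ValueError("no discordant pairs; the test is undefined")
    statistic = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # For a chi-square variable with 1 degree of freedom,
    # P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value
```

When the number of discordant pairs is small, an exact binomial version of the test is usually preferred over this approximation.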
Abstract: In many industries, control charts are among the most
frequently used tools for quality management. Hotelling's T² is
widely used in multivariate control charts. However, it is weak at
detecting small or medium process shifts. The use of supplementary
sensitizing rules can improve detection performance. This study
applied sensitizing rules to the Hotelling's T² control chart to improve
detection performance. A support vector machine (SVM) classifier
was used to identify the characteristic or group of characteristics that is
responsible for the signal and to classify the magnitude of the mean
shifts. The experimental results demonstrate that the SVM
classifier can effectively identify the characteristic
or group of characteristics that caused the process mean shifts and the
magnitude of the shifts.
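For reference, the Hotelling's T² statistic for individual observations can be sketched as below. The function name is hypothetical, and the in-control mean and covariance are assumed to be known or already estimated from reference data. The sensitizing rules and the SVM classifier described in the abstract are not reproduced here.

```python
import numpy as np

def hotelling_t2(observations, mean, covariance):
    """T^2 = (x - mean)^T S^{-1} (x - mean) for each row x of `observations`."""
    x = np.atleast_2d(np.asarray(observations, dtype=float))
    diff = x - np.asarray(mean, dtype=float)
    # Solve S z = (x - mean) instead of forming the explicit inverse of S.
    solved = np.linalg.solve(np.asarray(covariance, dtype=float), diff.T)
    return np.einsum("ij,ji->i", diff, solved)
```

Each returned value would then be compared against a control limit chosen for the desired false-alarm rate.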
Abstract: The outlook for food security in the 21st century points to
a shortage of food, with production facing a critical problem. Food
security strategy has long been used to assess food requirements.
Meanwhile, the nanotechnology revolution is changing the face of the world.
Nanotechnology is a suitable approach whose characteristics can be used to
reduce environmental problems and may improve access to food
for small farmers. This article shows the impact of the production and
adoption of nanocrops on food security. The study population consisted of
researchers at the agricultural research center of Esfahan province. The
results of the study show that there was a relationship between the use,
conversion, distribution, and production of nanocrops, operative human
resources, operative circumstances, and constraints on the usage of
nanocrops on the one hand and food security on the other. Multivariate
regression analysis using the enter method shows that operative
circumstances, use, production, and constraints on the usage of nanocrops
had a positive impact on food security
and, in four steps, explained 20 percent of it.
Abstract: Measuring competitiveness between countries or regions is an important topic of many economic analyses and scientific papers. In the European Union (EU), there is no mainstream approach to evaluating and measuring competitiveness. There are many opinions on, and methods for, measuring and evaluating competitiveness between states or regions at the national and European level. The methods differ in how they use the indicators of competitiveness and in how they process them. The aim of the paper is to analyze the main sources of competitive potential of the EU Member States with the help of Factor analysis (FA) and to classify the EU Member States into homogeneous units (clusters) according to the similarity of selected indicators of competitiveness factors by Cluster analysis (CA) in the reference years 2000 and 2011. The theoretical part of the paper is devoted to the fundamentals of competitiveness and the methodology of FA and CA. The empirical part of the paper deals with the evaluation of competitiveness factors in the EU Member States and the comparison of the evaluated countries by cluster analysis.
Abstract: Using a neural network, we try to model the unknown function f for given input-output data pairs. The connection strength of each neuron is updated through learning. Repeated simulations of a crisp neural network produce different values of the weight factors, which are directly affected by changes in different parameters. We propose the idea that, for each neuron in the network, we can obtain quasi-fuzzy weight sets (QFWS) using repeated simulation of the crisp neural network. Fuzzy weight functions of this type may be applied where multivariate crisp input needs to be adjusted after iterative learning, as in claim amount distribution analysis. Because real data is subject to noise and uncertainty, QFWS may be helpful in simplifying such complex problems. Secondly, these QFWS provide a good initial solution for training fuzzy neural networks with reduced computational complexity.
Abstract: Traditional multivariate control charts assume that measurements from manufacturing processes follow a multivariate normal distribution. However, this assumption may not hold or may be difficult to verify, because in practice not all measurements from manufacturing processes are normally distributed. This study develops a new multivariate control chart for monitoring processes with non-normal data. We propose a mechanism that integrates the one-class classification method with an adaptive technique. The adaptive technique is used to improve the sensitivity of one-class classification to small shifts in statistical process control. In addition, this design provides an easy way to allocate the value of the type I error, so it is easier to implement. Finally, a simulation study and real data from industry are used to demonstrate the effectiveness of the proposed control chart.
Abstract: This paper presents a Faults Forecasting System (FFS)
that uses statistical forecasting techniques to analyze process
variable data in order to forecast fault occurrences. FFS
proposes a new idea in fault detection. Current fault detection
techniques are based on analyzing the current status of the
system variables in order to check whether that status is faulty.
FFS instead uses forecasting techniques to predict the future timing of
faults before they happen. The proposed model applies a subset modeling
strategy and a Bayesian approach in order to decrease the dimensionality
of the process variables and improve fault forecasting accuracy. A
practical experiment was designed and implemented at Okayama
University, Japan, and the comparison shows that
our proposed model achieves high forecasting accuracy and
forecasts faults ahead of time.
Abstract: As the slider process in the hard disk drive industry
becomes more complex, defect diagnosis for yield improvement
becomes more complicated and time-consuming. Manufacturing data
analysis with a data mining approach is widely used to solve this
problem. The existing mining approach, which combines K-means
clustering, the machine-oriented Kruskal-Wallis test, and the
multivariate chart, has been applied to defect diagnosis, but it
remains a semiautomatic diagnosis system. This article aims to modify an
algorithm to support automatic decisions in the existing approach.
Based on the research framework, the new approach can perform
automatic diagnosis and helps engineers find the defect
factors about 50% faster than the existing approach.
Abstract: A series of microarray experiments produces observations
of differential expression for thousands of genes across multiple
conditions.
Principal component analysis (PCA) has been widely used in
multivariate data analysis to reduce the dimensionality of the data in
order to simplify subsequent analysis and allow for summarization of
the data in a parsimonious manner. PCA, which can be implemented
via a singular value decomposition (SVD), is useful for the analysis of
microarray data.
To apply PCA using the SVD, we use the DNA microarray
data for the small round blue cell tumors (SRBCT) of childhood
by Khan et al. (2001). To decide the number of components that
account for a sufficient amount of information, we draw a scree plot.
The biplot, a graphic display associated with PCA, reveals important
features that exhibit the relationship between variables and also the
relationship of variables with observations.
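PCA via the SVD, together with the quantities a scree plot displays, can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `pca_svd` is hypothetical, and any input matrix (rows as observations, columns as variables) stands in for the SRBCT data.

```python
import numpy as np

def pca_svd(data):
    """PCA of a (samples x variables) matrix via the SVD of the centered data.

    Returns (scores, components, explained_ratio):
      scores          - coordinates of each sample on the principal components
      components      - rows are the principal directions (right singular vectors)
      explained_ratio - share of total variance per component (scree plot values)
    """
    x = np.asarray(data, dtype=float)
    centered = x - x.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u * s  # equivalent to centered @ vt.T
    explained_variance = s ** 2 / (x.shape[0] - 1)
    explained_ratio = explained_variance / explained_variance.sum()
    return scores, vt, explained_ratio
```

Plotting `explained_ratio` against the component index gives the scree plot; the first two columns of `scores` and the first two rows of `components` supply the observation and variable coordinates for a biplot.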
Abstract: The fake finger submission attack is a major problem in fingerprint recognition systems. In this paper, we introduce an aliveness detection method based on multiple static features, which are derived from a single fingerprint image. The static features comprise individual pore spacing, residual noise and several first-order statistics. Specifically, a correlation filter is adopted to address individual pore spacing. The multiple static features are useful for reflecting the physiological and statistical characteristics of live and fake fingerprints. The classification can be made by calculating the liveness scores from each feature and fusing the scores through a classifier. On our dataset, we compare nine classifiers, and the best classification rate of 85% is attained by a Reduced Multivariate Polynomial classifier. Our approach is faster and more convenient for aliveness checks in field applications.
Abstract: The concentrations of As, Hg, Co, Cr and Cd were
tested for each soil sample, and their spatial patterns were analyzed
by the semivariogram approach of geostatistics and geographical
information system technology. Multivariate statistical approaches
(principal component analysis and cluster analysis) were used to
identify heavy metal sources and their spatial pattern. Principal
component analysis coupled with correlation between heavy metals
showed that the primary inputs of As, Hg and Cd were due to
anthropogenic sources, while Co and Cr were associated with pedogenic
factors. Ordinary kriging was carried out to map the spatial patterns of
heavy metals. The high-pollution sources identified were related to the
use of urban and industrial wastewater. The results of this study
are helpful for risk assessment of environmental pollution and for decision
making on industrial adjustment and the remediation of soil pollution.
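The semivariogram approach referred to in this abstract starts from an empirical semivariogram, which can be sketched as below. The function name, the distance binning scheme, and the inputs are assumptions for illustration; the abstract does not describe the authors' software or lag settings. For each distance bin, the estimate is half the mean squared difference between values at pairs of sample locations separated by a distance in that bin.

```python
import numpy as np

def empirical_semivariogram(coords, values, bin_edges):
    """Estimate gamma(h) = mean((z_i - z_j)^2) / 2 over pairs in each distance bin.

    coords    - (n, d) array of sample locations
    values    - (n,) array of measured concentrations
    bin_edges - increasing distances; bin k covers [bin_edges[k], bin_edges[k+1])
    Bins that contain no pairs are returned as NaN.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)  # each unordered pair once
    dists = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    gamma = np.full(len(bin_edges) - 1, np.nan)
    for k in range(len(bin_edges) - 1):
        mask = (dists >= bin_edges[k]) & (dists < bin_edges[k + 1])
        if mask.any():
            gamma[k] = 0.5 * sqdiff[mask].mean()
    return gamma
```

A variogram model fitted to these binned estimates would then supply the weights used by ordinary kriging.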
Abstract: Segmentation techniques based on Active Contour
Models have benefited strongly from the use of prior information
during their evolution. Shape prior information is captured from
a training set and is introduced into the optimization procedure to
restrict the evolution to allowable shapes. In this way, the evolution
converges onto regions even with weak boundaries. Although
significant effort has been devoted to different ways of capturing
and analyzing prior information, very little thought has been given
to the way image information is combined with prior information.
This paper focuses on a more natural way of incorporating the
prior information in the level set framework. As a proof of concept,
the method is applied to hippocampus segmentation in T1-MR
images. Hippocampus segmentation is a very challenging task, due
to the multivariate surrounding region and the missing boundary
with the neighboring amygdala, whose intensities are identical. The
proposed method mimics the way humans perform segmentation and thus
shows improved segmentation accuracy.
Abstract: Independent component analysis (ICA) is a computational method for finding underlying signals or components from multivariate statistical data. The ICA method has been successfully applied in many fields, e.g. in vision research, brain imaging, geological signals and telecommunications. In this paper, we apply the ICA method to an analysis of mass spectra of oligomeric species that emerge from aluminium sulphate. Mass spectra are typically complex, because they are linear combinations of spectra from different types of oligomeric species. The results show that ICA can decompose the spectral components to yield useful information. This information is essential in developing the coagulation phases of water treatment processes.
Abstract: Mathematical justifications are given for a technique for simulating multivariate non-Gaussian random processes and fields based on Rosenblatt's transformation of Gaussian processes. Different types of convergence are given for the approximating sequence. Moreover, an original numerical method is proposed to solve the functional equation that yields the autocorrelation function of the underlying Gaussian process.
Abstract: In recent years, copulas have become very popular in
financial research and actuarial science as they are more flexible in
modelling the co-movements and relationships of risk factors as compared
to the conventional linear correlation coefficient by Pearson.
However, a precise estimation of the copula parameters is vital in
order to correctly capture the (possibly nonlinear) dependence structure
and joint tail events. In this study, we employ two optimization
heuristics, namely Differential Evolution and Threshold Accepting, to
tackle the parameter estimation of multivariate t-distribution models
in the EML approach. Since the evolutionary optimizer does not rely
on gradient search, the EML approach can be applied to estimation of
more complicated copula models such as high-dimensional copulas.
Our experimental study shows that the proposed method provides
more robust and more accurate estimates as compared to the IFM
approach.