Multivariate School Travel Demand Regression Based on Trip Attraction

Since primary school trips usually start from home, attention by many scholars have been focused on the home end for data gathering. Thereafter category analysis has often been relied upon when predicting school travel demands. In this paper, school end was relied on for data gathering and multivariate regression for future travel demand prediction. 9859 pupils were surveyed by way of questionnaires at 21 primary schools. The town was divided into 5 zones. The study was carried out in Skudai Town, Malaysia. Based on the hypothesis that the number of primary school trip ends are expected to be the same because school trips are fixed, the choice of trip end would have inconsequential effect on the outcome. The study compared empirical data for home and school trip end productions and attractions. Variance from both data results was insignificant, although some claims from home based family survey were found to be grossly exaggerated. Data from the school trip ends was relied on for travel demand prediction because of its completeness. Accessibility, trip attraction and trip production were then related to school trip rates under daylight and dry weather conditions. The paper concluded that, accessibility is an important parameter when predicting demand for future school trip rates.

Application of Multi-Dimensional Principal Component Analysis to Medical Data

Multi-dimensional principal component analysis (PCA) is the extension of the PCA, which is used widely as the dimensionality reduction technique in multivariate data analysis, to handle multi-dimensional data. To calculate the PCA the singular value decomposition (SVD) is commonly employed by the reason of its numerical stability. The multi-dimensional PCA can be calculated by using the higher-order SVD (HOSVD), which is proposed by Lathauwer et al., similarly with the case of ordinary PCA. In this paper, we apply the multi-dimensional PCA to the multi-dimensional medical data including the functional independence measure (FIM) score, and describe the results of experimental analysis.

Holistic Face Recognition using Multivariate Approximation, Genetic Algorithms and AdaBoost Classifier: Preliminary Results

Several works regarding facial recognition have dealt with methods which identify isolated characteristics of the face or with templates which encompass several regions of it. In this paper a new technique which approaches the problem holistically dispensing with the need to identify geometrical characteristics or regions of the face is introduced. The characterization of a face is achieved by randomly sampling selected attributes of the pixels of its image. From this information we construct a set of data, which correspond to the values of low frequencies, gradient, entropy and another several characteristics of pixel of the image. Generating a set of “p" variables. The multivariate data set with different polynomials minimizing the data fitness error in the minimax sense (L∞ - Norm) is approximated. With the use of a Genetic Algorithm (GA) it is able to circumvent the problem of dimensionality inherent to higher degree polynomial approximations. The GA yields the degree and values of a set of coefficients of the polynomials approximating of the image of a face. By finding a family of characteristic polynomials from several variables (pixel characteristics) for each face (say Fi ) in the data base through a resampling process the system in use, is trained. A face (say F ) is recognized by finding its characteristic polynomials and using an AdaBoost Classifier from F -s polynomials to each of the Fi -s polynomials. The winner is the polynomial family closer to F -s corresponding to target face in data base.

Self Organizing Mixture Network in Mixture Discriminant Analysis: An Experimental Study

In the recent works related with mixture discriminant analysis (MDA), expectation and maximization (EM) algorithm is used to estimate parameters of Gaussian mixtures. But, initial values of EM algorithm affect the final parameters- estimates. Also, when EM algorithm is applied two times, for the same data set, it can be give different results for the estimate of parameters and this affect the classification accuracy of MDA. Forthcoming this problem, we use Self Organizing Mixture Network (SOMN) algorithm to estimate parameters of Gaussians mixtures in MDA that SOMN is more robust when random the initial values of the parameters are used [5]. We show effectiveness of this method on popular simulated waveform datasets and real glass data set.

Developing Pedotransfer Functions for Estimating Some Soil Properties using Artificial Neural Network and Multivariate Regression Approaches

Study of soil properties like field capacity (F.C.) and permanent wilting point (P.W.P.) play important roles in study of soil moisture retention curve. Although these parameters can be measured directly, their measurement is difficult and expensive. Pedotransfer functions (PTFs) provide an alternative by estimating soil parameters from more readily available soil data. In this investigation, 70 soil samples were collected from different horizons of 15 soil profiles located in the Ziaran region, Qazvin province, Iran. The data set was divided into two subsets for calibration (80%) and testing (20%) of the models and their normality were tested by Kolmogorov-Smirnov method. Both multivariate regression and artificial neural network (ANN) techniques were employed to develop the appropriate PTFs for predicting soil parameters using easily measurable characteristics of clay, silt, O.C, S.P, B.D and CaCO3. The performance of the multivariate regression and ANN models was evaluated using an independent test data set. In order to evaluate the models, root mean square error (RMSE) and R2 were used. The comparison of RSME for two mentioned models showed that the ANN model gives better estimates of F.C and P.W.P than the multivariate regression model. The value of RMSE and R2 derived by ANN model for F.C and P.W.P were (2.35, 0.77) and (2.83, 0.72), respectively. The corresponding values for multivariate regression model were (4.46, 0.68) and (5.21, 0.64), respectively. Results showed that ANN with five neurons in hidden layer had better performance in predicting soil properties than multivariate regression.

Customer Need Type Classification Model using Data Mining Techniques for Recommender Systems

Recommender systems are usually regarded as an important marketing tool in the e-commerce. They use important information about users to facilitate accurate recommendation. The information includes user context such as location, time and interest for personalization of mobile users. We can easily collect information about location and time because mobile devices communicate with the base station of the service provider. However, information about user interest can-t be easily collected because user interest can not be captured automatically without user-s approval process. User interest usually represented as a need. In this study, we classify needs into two types according to prior research. This study investigates the usefulness of data mining techniques for classifying user need type for recommendation systems. We employ several data mining techniques including artificial neural networks, decision trees, case-based reasoning, and multivariate discriminant analysis. Experimental results show that CHAID algorithm outperforms other models for classifying user need type. This study performs McNemar test to examine the statistical significance of the differences of classification results. The results of McNemar test also show that CHAID performs better than the other models with statistical significance.

Support Vector Machines Approach for Detecting the Mean Shifts in Hotelling-s T2 Control Chart with Sensitizing Rules

In many industries, control charts is one of the most frequently used tools for quality management. Hotelling-s T2 is used widely in multivariate control chart. However, it has little defect when detecting small or medium process shifts. The use of supplementary sensitizing rules can improve the performance of detection. This study applied sensitizing rules for Hotelling-s T2 control chart to improve the performance of detection. Support vector machines (SVM) classifier to identify the characteristic or group of characteristics that are responsible for the signal and to classify the magnitude of the mean shifts. The experimental results demonstrate that the support vector machines (SVM) classifier can effectively identify the characteristic or group of characteristics that caused the process mean shifts and the magnitude of the shifts.

Survey of Impact of Production and Adoption of Nanocrops on Food Security

Perspective of food security in 21 century showed shortage of food that production is faced to vital problem. Food security strategy is applied longtime method to assess required food. Meanwhile, nanotechnology revolution changes the world face. Nanotechnology is adequate method utilize of its characteristics to decrease environmental problems and possible further access to food for small farmers. This article will show impact of production and adoption of nanocrops on food security. Population is researchers of agricultural research center of Esfahan province. The results of study show that there was a relationship between uses, conversion, distribution, and production of nanocrops, operative human resources, operative circumstance, and constrains of usage of nanocrops and food security. Multivariate regression analysis by enter model shows that operative circumstance, use, production and constrains of usage of nanocrops had positive impact on food security and they determine in four steps 20 percent of it.

Assessment of EU Competitiveness Factors by Multivariate Methods

Measurement of competitiveness between countries or regions is an important topic of many economic analysis and scientific papers. In European Union (EU), there is no mainstream approach of competitiveness evaluation and measuring. There are many opinions and methods of measurement and evaluation of competitiveness between states or regions at national and European level. The methods differ in structure of using the indicators of competitiveness and ways of their processing. The aim of the paper is to analyze main sources of competitive potential of the EU Member States with the help of Factor analysis (FA) and to classify the EU Member States to homogeneous units (clusters) according to the similarity of selected indicators of competitiveness factors by Cluster analysis (CA) in reference years 2000 and 2011. The theoretical part of the paper is devoted to the fundamental bases of competitiveness and the methodology of FA and CA methods. The empirical part of the paper deals with the evaluation of competitiveness factors in the EU Member States and cluster comparison of evaluated countries by cluster analysis. 

Approximate Bounded Knowledge Extraction Using Type-I Fuzzy Logic

Using neural network we try to model the unknown function f for given input-output data pairs. The connection strength of each neuron is updated through learning. Repeated simulations of crisp neural network produce different values of weight factors that are directly affected by the change of different parameters. We propose the idea that for each neuron in the network, we can obtain quasi-fuzzy weight sets (QFWS) using repeated simulation of the crisp neural network. Such type of fuzzy weight functions may be applied where we have multivariate crisp input that needs to be adjusted after iterative learning, like claim amount distribution analysis. As real data is subjected to noise and uncertainty, therefore, QFWS may be helpful in the simplification of such complex problems. Secondly, these QFWS provide good initial solution for training of fuzzy neural networks with reduced computational complexity.

An AK-Chart for the Non-Normal Data

Traditional multivariate control charts assume that measurement from manufacturing processes follows a multivariate normal distribution. However, this assumption may not hold or may be difficult to verify because not all the measurement from manufacturing processes are normal distributed in practice. This study develops a new multivariate control chart for monitoring the processes with non-normal data. We propose a mechanism based on integrating the one-class classification method and the adaptive technique. The adaptive technique is used to improve the sensitivity to small shift on one-class classification in statistical process control. In addition, this design provides an easy way to allocate the value of type I error so it is easier to be implemented. Finally, the simulation study and the real data from industry are used to demonstrate the effectiveness of the propose control charts.

Faults Forecasting System

This paper presents Faults Forecasting System (FFS) that utilizes statistical forecasting techniques in analyzing process variables data in order to forecast faults occurrences. FFS is proposing new idea in detecting faults. Current techniques used in faults detection are based on analyzing the current status of the system variables in order to check if the current status is fault or not. FFS is using forecasting techniques to predict future timing for faults before it happens. Proposed model is applying subset modeling strategy and Bayesian approach in order to decrease dimensionality of the process variables and improve faults forecasting accuracy. A practical experiment, designed and implemented in Okayama University, Japan, is implemented, and the comparison shows that our proposed model is showing high forecasting accuracy and BEFORE-TIME.

Modified Data Mining Approach for Defective Diagnosis in Hard Disk Drive Industry

Currently, slider process of Hard Disk Drive Industry become more complex, defective diagnosis for yield improvement becomes more complicated and time-consumed. Manufacturing data analysis with data mining approach is widely used for solving that problem. The existing mining approach from combining of the KMean clustering, the machine oriented Kruskal-Wallis test and the multivariate chart were applied for defective diagnosis but it is still be a semiautomatic diagnosis system. This article aims to modify an algorithm to support an automatic decision for the existing approach. Based on the research framework, the new approach can do an automatic diagnosis and help engineer to find out the defective factors faster than the existing approach about 50%.

Principal Component Analysis using Singular Value Decomposition of Microarray Data

A series of microarray experiments produces observations of differential expression for thousands of genes across multiple conditions. Principal component analysis(PCA) has been widely used in multivariate data analysis to reduce the dimensionality of the data in order to simplify subsequent analysis and allow for summarization of the data in a parsimonious manner. PCA, which can be implemented via a singular value decomposition(SVD), is useful for analysis of microarray data. For application of PCA using SVD we use the DNA microarray data for the small round blue cell tumors(SRBCT) of childhood by Khan et al.(2001). To decide the number of components which account for sufficient amount of information we draw scree plot. Biplot, a graphic display associated with PCA, reveals important features that exhibit relationship between variables and also the relationship of variables with observations.

Aliveness Detection of Fingerprints using Multiple Static Features

Fake finger submission attack is a major problem in fingerprint recognition systems. In this paper, we introduce an aliveness detection method based on multiple static features, which derived from a single fingerprint image. The static features are comprised of individual pore spacing, residual noise and several first order statistics. Specifically, correlation filter is adopted to address individual pore spacing. The multiple static features are useful to reflect the physiological and statistical characteristics of live and fake fingerprint. The classification can be made by calculating the liveness scores from each feature and fusing the scores through a classifier. In our dataset, we compare nine classifiers and the best classification rate at 85% is attained by using a Reduced Multivariate Polynomial classifier. Our approach is faster and more convenient for aliveness check for field applications.

Spatial Distribution and Risk Assessment of As, Hg, Co and Cr in Kaveh Industrial City, using Geostatistic and GIS

The concentrations of As, Hg, Co, Cr and Cd were tested for each soil sample, and their spatial patterns were analyzed by the semivariogram approach of geostatistics and geographical information system technology. Multivariate statistic approaches (principal component analysis and cluster analysis) were used to identify heavy metal sources and their spatial pattern. Principal component analysis coupled with correlation between heavy metals showed that primary inputs of As, Hg and Cd were due to anthropogenic while, Co, and Cr were associated with pedogenic factors. Ordinary kriging was carried out to map the spatial patters of heavy metals. The high pollution sources evaluated was related with usage of urban and industrial wastewater. The results of this study helpful for risk assessment of environmental pollution for decision making for industrial adjustment and remedy soil pollution.

Hippocampus Segmentation using a Local Prior Model on its Boundary

Segmentation techniques based on Active Contour Models have been strongly benefited from the use of prior information during their evolution. Shape prior information is captured from a training set and is introduced in the optimization procedure to restrict the evolution into allowable shapes. In this way, the evolution converges onto regions even with weak boundaries. Although significant effort has been devoted on different ways of capturing and analyzing prior information, very little thought has been devoted on the way of combining image information with prior information. This paper focuses on a more natural way of incorporating the prior information in the level set framework. For proof of concept the method is applied on hippocampus segmentation in T1-MR images. Hippocampus segmentation is a very challenging task, due to the multivariate surrounding region and the missing boundary with the neighboring amygdala, whose intensities are identical. The proposed method, mimics the human segmentation way and thus shows enhancements in the segmentation accuracy.

Independent Component Analysis to Mass Spectra of Aluminium Sulphate

Independent component analysis (ICA) is a computational method for finding underlying signals or components from multivariate statistical data. The ICA method has been successfully applied in many fields, e.g. in vision research, brain imaging, geological signals and telecommunications. In this paper, we apply the ICA method to an analysis of mass spectra of oligomeric species emerged from aluminium sulphate. Mass spectra are typically complex, because they are linear combinations of spectra from different types of oligomeric species. The results show that ICA can decomposite the spectral components for useful information. This information is essential in developing coagulation phases of water treatment processes.

Simulation of Sample Paths of Non Gaussian Stationary Random Fields

Mathematical justifications are given for a simulation technique of multivariate nonGaussian random processes and fields based on Rosenblatt-s transformation of Gaussian processes. Different types of convergences are given for the approaching sequence. Moreover an original numerical method is proposed in order to solve the functional equation yielding the underlying Gaussian process autocorrelation function.

EML-Estimation of Multivariate t Copulas with Heuristic Optimization

In recent years, copulas have become very popular in financial research and actuarial science as they are more flexible in modelling the co-movements and relationships of risk factors as compared to the conventional linear correlation coefficient by Pearson. However, a precise estimation of the copula parameters is vital in order to correctly capture the (possibly nonlinear) dependence structure and joint tail events. In this study, we employ two optimization heuristics, namely Differential Evolution and Threshold Accepting to tackle the parameter estimation of multivariate t distribution models in the EML approach. Since the evolutionary optimizer does not rely on gradient search, the EML approach can be applied to estimation of more complicated copula models such as high-dimensional copulas. Our experimental study shows that the proposed method provides more robust and more accurate estimates as compared to the IFM approach.