Motivated Support Vector Regression with Structural Prior Knowledge

It is known that incorporating prior knowledge into support vector regression (SVR) can help to improve approximation performance. Most research is concerned with incorporating knowledge in the form of numerical relationships. Little work, however, has been done on incorporating prior knowledge about the structural relationships among the variables (referred to as Structural Prior Knowledge, SPK). This paper explores the incorporation of SPK in SVR by constructing appropriate admissible support vector kernels (SV kernels) based on the properties of reproducing kernels (R.K.). Three levels of SPK specification are studied, together with the corresponding sub-levels of prior knowledge that the method can accommodate. These include Hierarchical SPK (HSPK); Interactional SPK (ISPK), consisting of independence, global interaction and local interaction; and Functional SPK (FSPK), composed of exterior-FSPK and interior-FSPK. A convenient tool for describing SPK, namely the Description Matrix of SPK, is introduced. Subsequently, a new SVR, namely Motivated Support Vector Regression (MSVR), whose structure is motivated in part by SPK, is proposed. Synthetic examples show that it is possible to incorporate a wide variety of SPK and that doing so helps to improve approximation performance in complex cases. The benefits of MSVR are finally shown on a real-life military application, air-to-ground battle simulation, which indicates great potential for MSVR in complex military applications.

Using HMM-based Classifier Adapted to Background Noises with Improved Sounds Features for Audio Surveillance Application

The goal of our work is to discriminate between different classes of environmental sounds. A sound recognition system offers concrete potential for surveillance and security applications. The first contribution of this paper to the field is a thorough investigation of the applicability of state-of-the-art audio features to environmental sound recognition. Additionally, a set of novel features obtained by combining the basic parameters is introduced. The quality of the features investigated is evaluated with an HMM-based classifier, which receives particular attention. Specifically, we propose a multi-style training system based on HMMs: one recognizer is trained on a database that includes different levels of background noise and is used as a universal recognizer for every environment. To enhance the robustness of the system by reducing environmental variability, we explore different adaptation algorithms, including Maximum Likelihood Linear Regression (MLLR), Maximum A Posteriori (MAP) and the MAP/MLLR algorithm that combines MAP and MLLR. Experimental evaluation shows that a rather good recognition rate can be reached, even under severe noise degradation, when the system is fed an appropriate set of features.

Regression Test Selection Technique for Multi-Programming Language

Regression testing is a maintenance activity applied to modified software to provide confidence that the changed parts are correct and that the unchanged parts have not been adversely affected by the modifications. Regression test selection techniques reduce the cost of regression testing by selecting a subset of an existing test suite to use in retesting modified programs. This paper presents the first general regression-test-selection technique, which is code-based and allows test cases to be selected for programs written in any programming language. It also handles incomplete programs. We also describe RTSDiff, a regression-test-selection system that implements the proposed technique. The results of an empirical study performed on four programming languages (Java, C#, C++ and Visual Basic) show that the technique is efficient and effective in reducing the size of the test suite.
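
The idea of code-based selection can be sketched as follows. This is a minimal illustration, not RTSDiff's algorithm: it assumes that a per-test line-coverage map is already available (the `coverage` dictionary and the test names are hypothetical), and it uses a plain textual diff, which does not depend on the programming language of the program under test.

```python
import difflib

def changed_lines(old_src: str, new_src: str) -> set[int]:
    """Return 1-based line numbers of the OLD source touched by the change.

    Replaced and deleted lines are reported directly; a pure insertion is
    attributed to the old line just before the insertion point.
    """
    old = old_src.splitlines()
    new = new_src.splitlines()
    changed: set[int] = set()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            changed.update(range(i1 + 1, i2 + 1))
        elif tag == "insert":
            changed.add(max(i1, 1))
    return changed

def select_tests(coverage: dict[str, set[int]], changed: set[int]) -> list[str]:
    """Keep only the tests whose covered lines intersect the changed lines."""
    return sorted(name for name, lines in coverage.items() if lines & changed)

# Hypothetical example: one line of a four-line program is modified.
old_program = "a = 1\nb = 2\nc = a + b\nprint(c)\n"
new_program = "a = 1\nb = 3\nc = a + b\nprint(c)\n"

# Hypothetical coverage data: test name -> lines of the old program it executes.
coverage = {
    "test_a": {1, 3, 4},
    "test_b": {2, 3, 4},
    "test_print": {4},
}

selected = select_tests(coverage, changed_lines(old_program, new_program))
print(selected)
```

Working on text lines rather than on a parsed syntax tree is what keeps the sketch language-independent; a real tool would need a more precise notion of "affected code" than raw line overlap.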

Estimation of Critical Period for Weed Control in Corn in Iran

The critical period for weed control (CPWC) is the period in the crop growth cycle during which weeds must be controlled to prevent unacceptable yield losses. Field studies were conducted in 2005 and 2006 at the University of Birjand in southeastern Iran to determine the CPWC of corn using a randomized complete block design with 14 treatments and four replications. The treatments consisted of two different periods of weed interference, a critical weed-free period and a critical time of weed removal, which were imposed at V3, V6, V9, V12, V15 and R1 (based on phenological stages of corn development), together with a weedy check and a weed-free check. The CPWC was determined for acceptable yield loss levels of 2.5, 5, 10, 15 and 20% by nonlinear regression, fitting logistic and Gompertz equations to relative yield data. To prevent yield losses of 5%, the CPWC of corn was from the 5- to the 15-leaf stage (19-55 DAE). To prevent yield losses of 2.5, 10 and 20%, the period was the 4- to 17-leaf stage (14-59 DAE), the 6- to 12-leaf stage (25-47 DAE) and the 8- to 9-leaf stage (31-36 DAE), respectively. The height and leaf area index of corn were significantly decreased by weed competition in both weed free and weed infested treatments (P

Statistical Optimization of Enzymatic Hydrolysis of Potato (Solanum tuberosum) Starch by Immobilized α-amylase

Enzymatic hydrolysis of starch from natural sources has potential application in the commercial production of alcoholic beverages and bioethanol. In this study, the effects of starch concentration, temperature, time and enzyme concentration were studied and optimized, using a central composite design, for the hydrolysis of potato starch powder (of mesh 80/120) into glucose syrup by α-amylase immobilized with sodium alginate. The experimental results were subjected to multiple linear regression analysis using MINITAB 14 software. A positive linear effect of starch concentration, enzyme concentration and time on the hydrolysis of potato starch by α-amylase was observed. The statistical significance of the model was validated by an F-test for analysis of variance (p ≤ 0.01). The optimum values of starch concentration, enzyme concentration, temperature and time were found to be 6% (w/v), 2% (w/v), 40°C and 80 min, respectively. The maximum glucose yield at the optimum conditions was 2.34 mg/mL.
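
For readers unfamiliar with central composite designs, the sketch below generates the coded design points for four factors. It is illustrative only: the abstract does not give the number of centre points, the axial distance or the factor ranges, so the rotatable axial distance, `n_center=6` and the temperature range used in `decode` are assumptions.

```python
from itertools import product

def central_composite_design(k: int, n_center: int = 1) -> list[tuple[float, ...]]:
    """Coded runs of a rotatable central composite design for k factors.

    The design has 2**k factorial points at levels -1/+1, 2*k axial points
    at distance alpha on each axis, and n_center replicated centre points.
    """
    alpha = (2 ** k) ** 0.25  # rotatable choice: fourth root of the factorial run count
    factorial = [tuple(float(v) for v in p) for p in product((-1, 1), repeat=k)]
    axial = []
    for i in range(k):
        for level in (-alpha, alpha):
            point = [0.0] * k
            point[i] = level
            axial.append(tuple(point))
    center = [tuple([0.0] * k) for _ in range(n_center)]
    return factorial + axial + center

def decode(coded: float, low: float, high: float) -> float:
    """Map a coded level to real units, with -1 -> low and +1 -> high."""
    midpoint = (low + high) / 2
    half_range = (high - low) / 2
    return midpoint + coded * half_range

# Four factors as in the abstract: starch conc., enzyme conc., temperature, time.
design = central_composite_design(k=4, n_center=6)  # n_center is an assumption
print(len(design), "runs")

# Hypothetical temperature range (30 to 50 degrees C) for the -1/+1 levels.
print(decode(0.0, 30.0, 50.0))
```

The response measured at each run (here, glucose yield) would then be fitted with a regression model in the coded variables.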

A Statistical Approach for Predicting and Optimizing Depth of Cut in AWJ Machining for 6063-T6 Al Alloy

In this paper, a set of experimental data is used to assess the influence of abrasive water jet (AWJ) process parameters in cutting 6063-T6 aluminum alloy. The process variables considered include nozzle diameter, jet traverse rate, jet pressure and abrasive flow rate. The effects of these input parameters on depth of cut (h), one of the most important characteristics of AWJ, are studied. The Taguchi method and regression modeling are used to establish the relationships between input and output parameters. The adequacy of the model is evaluated using analysis of variance (ANOVA). In the next stage, the proposed model is embedded into a Simulated Annealing (SA) algorithm to optimize the AWJ process parameters. The objective is to determine a suitable set of process parameters that produces a desired depth of cut, within the allowed ranges of the process parameters. Computational results demonstrate the effectiveness of the proposed model and optimization procedure.
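
The coupling of a fitted regression model with simulated annealing can be sketched as below. Everything numeric here is hypothetical: the abstract does not report the regression coefficients, the parameter ranges or the SA settings, so `predicted_depth`, `BOUNDS` and the cooling schedule are placeholders chosen only to make the sketch runnable.

```python
import math
import random

# Hypothetical parameter ranges (units arbitrary): nozzle diameter, traverse
# rate, jet pressure, abrasive flow rate.
BOUNDS = {
    "nozzle_diameter": (0.8, 1.6),
    "traverse_rate": (50.0, 200.0),
    "jet_pressure": (100.0, 300.0),
    "abrasive_flow": (0.2, 1.0),
}

def predicted_depth(x: dict[str, float]) -> float:
    """Hypothetical linear regression model for depth of cut h."""
    return (2.0 + 1.5 * x["nozzle_diameter"] - 0.02 * x["traverse_rate"]
            + 0.03 * x["jet_pressure"] + 0.5 * x["abrasive_flow"])

def anneal(target: float, bounds: dict[str, tuple[float, float]],
           steps: int = 5000, t0: float = 1.0, cooling: float = 0.999,
           seed: int = 0) -> tuple[dict[str, float], float]:
    """Search for parameters whose predicted depth is close to `target`."""
    rng = random.Random(seed)
    names = list(bounds)
    current = {n: (lo + hi) / 2 for n, (lo, hi) in bounds.items()}
    current_cost = abs(predicted_depth(current) - target)
    best, best_cost = dict(current), current_cost
    temperature = t0
    for _ in range(steps):
        # Perturb one randomly chosen parameter and clamp it to its range.
        candidate = dict(current)
        name = rng.choice(names)
        lo, hi = bounds[name]
        candidate[name] = min(hi, max(lo, candidate[name] + rng.gauss(0.0, 0.05 * (hi - lo))))
        cost = abs(predicted_depth(candidate) - target)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if cost < current_cost or rng.random() < math.exp((current_cost - cost) / temperature):
            current, current_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = dict(candidate), cost
        temperature *= cooling
    return best, best_cost

best_params, best_error = anneal(target=8.0, bounds=BOUNDS)
print(best_params, best_error)
```

Because many parameter combinations can give the same predicted depth, a practical objective would likely add a secondary criterion (for example cost or cutting time); that choice is outside what the abstract states.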

Stature Estimation Using Foot and Shoeprint Length of Malaysian Population

Formulating a biological profile is one of the modern roles of the forensic anthropologist. The present study was conducted to estimate height from foot length and shoeprint length in the Malaysian population. The results can provide useful information for identifying individuals in forensic cases based on shoeprint evidence, helping to narrow down suspects and ease police investigations. In addition, stature is an important parameter in determining the partial identity of unidentified and mutilated bodies. This study can therefore help in cases of mass disaster, massacre, explosion and assault, where bodies are dismembered and become unrecognizable and body parts are very hard to identify. Samples were collected from 200 Malaysian adults (100 males and 100 females) aged 20 to 45 years. Shoeprint length was measured from prints made by flat shoes. Other information, such as the gender, foot length and height of each subject, was also recorded. The data were analyzed using IBM® SPSS Statistics 19 software. The results indicate that foot length has a stronger correlation with stature than shoeprint length for both feet. However, the pooled sample, in which gender was treated as unknown, showed a better correlation for both foot length and shoeprint length than males and females analyzed separately. In addition, prediction equations were developed to estimate stature by linear regression on foot length and shoeprint length; foot length gives better predictions than shoeprint length.

Does Corporate Governance or Transparency Affect Foreign Direct Investment?

The paper investigates the relationship between foreign direct investment (FDI) and corporate governance or transparency by examining country-level FDI flows, FDI inward performance, and corporate governance and transparency variables. From a regression analysis with the Newey-West estimator on panel data for 28 countries from 1990 to 2002, we find strong positive relationships between the corporate governance or transparency level of host countries and FDI inward performance within those countries. A strong positive relationship is found between the anti-director rights level or the number of analysts in host countries and FDI inward performance within those countries. We also find a positive relationship between the number of analysts in host countries and FDI inflows. The empirical results are consistent with stock market liberalization and corporate governance explanations of FDI.

Fuzzy Logic Approach to Robust Regression Models of Uncertain Medical Categories

Dichotomization of the outcome by a single cut-off point is an important part of various medical studies. Usually the relationship between the resulting dichotomized dependent variable and the explanatory variables is analyzed with linear regression, probit regression or logistic regression. However, in many real-life situations, the cut-off point dividing the outcome into two groups is unknown and can be specified only approximately, i.e. surrounded by some (small) uncertainty. This means that, to have any practical meaning, the regression model must be robust to this uncertainty. In this paper, we show that neither the beta in the linear regression model nor its significance level is robust to small variations in the dichotomization cut-off point. As an alternative, robust approach to the problem of uncertain medical categories, we propose to use a linear regression model with a fuzzy membership function as the dependent variable. This fuzzy membership function denotes the degree to which the value of the underlying (continuous) outcome falls below or above the dichotomization cut-off point. We demonstrate that the linear regression model of the fuzzy dependent variable can be insensitive to the uncertainty in the cut-off point location. We present modeling results from a real study of low hemoglobin levels in infants. We systematically test the robustness of the binomial regression model and of the linear regression model with the fuzzy dependent variable by changing the boundary for the category Anemia, and show that the behavior of the latter model persists over quite a wide interval.
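
The contrast between a crisp and a fuzzy dependent variable can be illustrated with a small sketch. The abstract does not specify the shape of the membership function, so the linear ramp below is an assumption, and the predictor and hemoglobin values are invented for illustration only.

```python
def fuzzy_below(y: float, cutoff: float, width: float) -> float:
    """Degree (0 to 1) to which y lies below the cut-off.

    Assumed shape: 1 below cutoff - width, 0 above cutoff + width,
    and a linear ramp in between. width <= 0 gives the crisp 0/1 variable.
    """
    if width <= 0:
        return 1.0 if y < cutoff else 0.0
    if y <= cutoff - width:
        return 1.0
    if y >= cutoff + width:
        return 0.0
    return (cutoff + width - y) / (2 * width)

def ols_slope(x: list[float], y: list[float]) -> float:
    """Slope (beta) of the simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

# Invented data: a predictor and a continuous outcome (hemoglobin-like values).
predictor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
outcome = [101.0, 104.0, 107.0, 109.0, 110.0, 111.0, 113.0, 116.0, 119.0, 122.0]

# Shift an illustrative cut-off and compare the slope under both codings.
for cutoff in (108.0, 110.0, 112.0):
    crisp = [fuzzy_below(v, cutoff, 0.0) for v in outcome]
    fuzzy = [fuzzy_below(v, cutoff, 5.0) for v in outcome]
    print(cutoff, ols_slope(predictor, crisp), ols_slope(predictor, fuzzy))
```

With the crisp coding, moving the cut-off flips individual observations between 0 and 1; with the ramp, each observation changes its value only gradually, which is the intuition behind the robustness claim.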

Analysis of Air Quality in the Outdoor Environment of the City of Messina by an Application of the Pollution Index Method

This paper reports an analysis of outdoor air pollution in the urban centre of the city of Messina. The variations in the concentrations of the most critical pollutants (PM10, O3, CO, C6H6) and their trends with respect to climatic parameters and vehicular traffic were studied. Linear regressions were performed to represent the relations among the pollutants; the differences between pollutant concentrations on weekends and weekdays were also analyzed. In order to evaluate air pollution and its effects on human health, a method for calculating a pollution index was implemented and applied to the urban centre of the city. This index is based on the weighted mean of the concentrations of the most detrimental air pollutants relative to their limit values for the protection of human health. The data on the polluting substances were collected by the Assessorship of the Environment of the Regional Province of Messina in 2004. A statistical analysis of the air quality index trends is also reported.

Application of Company Financial Crisis Early Warning Model- Use of “Financial Reference Database”

On July 1, 2007, the Taiwan Stock Exchange (TWSE) added a new "Financial reference database" to its Market Observation Post System (MOPS) for investors to use as an investment reference. The database serves as a warning based on the public financial information of listed public-offering companies, and it originally contained eight indicators. This paper applies the indicators provided by the database to a company financial crisis early warning model, in order to verify whether they forecast corporate financial crises with a high accuracy rate, in line with the positive results reported by domestic and foreign scholars. A logistic regression model is used as the financial early warning model: the first model does not include the back-conditions and the second model does. Companies in which a financial crisis occurred were taken as research samples, and sample data from T-1 and T-2 before the crisis were used for empirical analysis. The results show that the debt ratio and net per share provided by the database are the best forecast variables.

Using Support Vector Machine for Prediction Dynamic Voltage Collapse in an Actual Power System

This paper presents dynamic voltage collapse prediction on an actual power system using support vector machines (SVM). Dynamic voltage collapse prediction is first determined based on the PTSI calculated from information in the dynamic simulation output. Simulations were carried out on a practical 87-bus test system, considering load increase as the contingency. The data collected from the time-domain simulation are then used as input to the SVM, in which support vector regression is used as a predictor to determine the dynamic voltage collapse indices of the power system. To reduce training time and improve the accuracy of the SVM, the kernel function type and kernel parameter are considered. To verify the effectiveness of the proposed SVM method, its performance is compared with that of the multilayer perceptron neural network (MLPNN). Studies show that the SVM gives faster and more accurate results for dynamic voltage collapse prediction than the MLPNN.

Categorical Data Modeling: Logistic Regression Software

Matlab-based software for logistic regression is developed to enhance the teaching of quantitative topics and to assist researchers in analyzing a wide range of applications involving categorical data. The software offers the option of performing stepwise logistic regression to select the most significant predictors. It includes a feature to detect influential observations in the data, and it investigates the effect of dropping or misclassifying an observation on a predictor variable. The input data may be supplied either as a set of individual responses (yes/no) with the predictor variables or as grouped records summarizing the various categories for each unique set of predictor variable values. Graphical displays are used to output various statistical results and to assess the goodness of fit of the logistic regression model. The software recognizes possible convergence constraints when present in the data and notifies the user accordingly.

Limiting Fiber Extensibility as Parameter for Damage in Venous Wall

An inflation–extension test on a human vena cava inferior was performed with the aim of fitting a material model. The vein was modeled as a thick-walled tube loaded by internal pressure and axial force. The material was assumed to be an incompressible, hyperelastic, fiber-reinforced continuum. The fibers are assumed to be arranged in two families of anti-symmetric helices, and the anisotropy considered corresponds to local orthotropy. The strain energy density function used was based on the concept of limiting strain extensibility. The pressurization comprised four pre-cycles under physiological venous loading (0–4 kPa) and four cycles under nonphysiological loading (0–21 kPa). Each overloading cycle was performed with a different value of axial weight. The overloading data were used in a regression analysis to fit the material model. The model did not fit the experimental data well; predictions of axial force, in particular, failed. It was hypothesized that, owing to the nonphysiological values of loading pressure and the different values of axial weight, the material was not sufficiently preconditioned and some damage occurred inside the wall. The limiting fiber extensibility parameter Jm was assumed to be related to this supposed damage. Each overloading cycle was fitted separately with a different value of Jm, while the other parameters were held the same. This approach turned out to be successful: a variable value of Jm can describe changes in the axial force–axial stretch response and satisfy the pressure–radius dependence simultaneously.

Methodology of Realization for Supervisor and Simulator Dedicated to a Semiconductor Research and Production Factory

In the micro- and nano-technology industry, the «clean-rooms» dedicated to chip manufacturing are equipped with the most sophisticated equipment and tools. They use a large number of resources, according to strict specifications, for optimum operation and results. The distribution of «utilities» to production is ensured by teams who use a supervision tool. Studies show the value of controlling the various parameters of production and/or distribution in real time through a reliable and effective supervision tool. This document looks at a large part of the functions that the supervisor must provide, together with complementary functionalities for diagnosis and simulation, which prove very useful in our case, where the supervised installations are complex and constantly evolving.

Designing of the Heating Process for Fiber-Reinforced Thermoplastics with Middle-Wave Infrared Radiators

Manufacturing components from fiber-reinforced thermoplastics requires three steps: heating the matrix, forming and consolidating the composite, and finally cooling the matrix. For the heating process, a pre-determined temperature distribution through the layers and the thickness of the pre-consolidated sheets is recommended to enable the forming mechanism. A design of the heating process for forming composites with thermoplastic matrices is therefore necessary. To obtain a constant temperature through the thickness and across the width of the sheet, the heating process was analyzed with the help of the finite element method. The simulation models were validated by experiments with resistance thermometers as well as with an infrared camera. Based on the finite element simulation, heating methods for infrared radiators were developed. With numerical simulation, many iteration loops are required to determine the process parameters. Hence, work was begun on a model that calculates the relevant process parameters by applying regression functions.

The Impact of the Type of Diversification of Listed Construction Enterprises in China on Corporation Performance

The construction industry is a pillar industry in China, accounting for about 6% of the gross domestic product. As the external environment of the Chinese construction industry changes, construction firms face fierce competition. This paper investigates the relationship between the type of diversification of construction firms in China and their performance. Based on the generalist and specialist strategies of organizational ecology, we argue that the generalist organization corresponds to an enterprise with diversified development, while the specialist organization corresponds to a specialized enterprise. The study uses annual financial data of listed construction firms to verify empirically the relationship between diversification and corporate performance, establishing a regression equation for econometric analysis. We find that: 1) Specialization can significantly improve the profitability of listed construction firms, and it has a significant positive relationship with corporate performance; 2) The operating performance of listed construction enterprises that engage in unrelated diversification is higher than that of enterprises with related diversification; 3) The relationship between state ownership of construction firms and corporate performance is negative. The longer a firm has been established, the higher its performance; however, the longer it has been listed, the lower its performance.

A Comparison of Some Thresholding Selection Methods for Wavelet Regression

In wavelet regression, choosing the threshold value is a crucial issue. Too large a value cuts too many coefficients, resulting in oversmoothing. Conversely, too small a value allows many coefficients to be included in the reconstruction, giving a wiggly estimate, that is, undersmoothing. The proper choice of threshold can therefore be seen as a careful balance between these two effects. This paper gives a very brief introduction to some threshold selection methods: Universal, SURE, EBayes, two-fold cross-validation and level-dependent cross-validation. A simulation study over a variety of sample sizes, test functions and signal-to-noise ratios is conducted to compare their numerical performance under three different noise structures. For Gaussian noise, EBayes outperforms the other methods in all cases for all test functions used, while two-fold cross-validation provides the best results in the case of long-tailed noise. For large signal-to-noise ratios, level-dependent cross-validation works well under correlated noise. As expected, increasing both the sample size and the signal-to-noise ratio increases estimation efficiency.
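
As a concrete reference point for the first method in the list, the sketch below implements the universal threshold, sigma * sqrt(2 ln n), together with soft thresholding and the usual median-based noise estimate. The detail coefficients are invented; the other selection methods (SURE, EBayes, cross-validation) are not shown.

```python
import math
import statistics

def universal_threshold(n: int, sigma: float) -> float:
    """Universal threshold: sigma * sqrt(2 * ln(n)) for n observations."""
    return sigma * math.sqrt(2.0 * math.log(n))

def mad_sigma(finest_detail: list[float]) -> float:
    """Robust noise estimate from the finest-level detail coefficients:
    median of absolute values divided by 0.6745."""
    return statistics.median(abs(d) for d in finest_detail) / 0.6745

def soft_threshold(coeffs: list[float], lam: float) -> list[float]:
    """Shrink each coefficient toward zero by lam; small ones become zero."""
    return [math.copysign(max(abs(c) - lam, 0.0), c) for c in coeffs]

def hard_threshold(coeffs: list[float], lam: float) -> list[float]:
    """Keep coefficients larger than lam in magnitude, zero out the rest."""
    return [c if abs(c) > lam else 0.0 for c in coeffs]

# Invented detail coefficients: mostly noise plus two large "signal" values.
detail = [0.3, -0.2, 5.1, 0.1, -0.4, -4.8, 0.25, 0.05]

sigma_hat = mad_sigma(detail)
lam = universal_threshold(n=len(detail), sigma=sigma_hat)
print(lam)
print(soft_threshold(detail, lam))
```

A larger threshold zeroes more coefficients (smoother estimate), a smaller one keeps more (wigglier estimate), which is the trade-off the abstract describes.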

Predicting the Three Major Dimensions of the Learner's Emotions from Brainwaves

This paper investigates how machine learning techniques can be used to predict the three major dimensions of a learner's emotions (pleasure, arousal and dominance) from brainwaves. The study used an experiment in which participants were exposed to a set of pictures from the International Affective Picture System (IAPS) while their electrical brain activity was recorded with an electroencephalogram (EEG). The pictures had already been rated in a previous study, via the Self-Assessment Manikin (SAM) affective rating system, on the three dimensions of pleasure, arousal and dominance. For each picture, we took the mean of these values over all subjects in that previous study and associated it with the recorded brainwaves of the participants in our study. Correlation and regression analyses confirmed the hypothesis that brainwave measures can significantly predict emotional dimensions. This can be very useful in the case of impassive, taciturn or disabled learners. Standard classification techniques were used to assess the reliability of the automatic detection of learners' three major emotional dimensions from the brainwaves. We discuss the results and the pertinence of such a method for assessing a learner's emotions and integrating it into a brainwave-sensing Intelligent Tutoring System.

Role of Customers in Stakeholders' Approach in Company Corporate Governance

The purpose of this paper is to explore the relationship between customers' issues in company corporate governance and financial performance. First, the theoretical background, consisting of stakeholder theory and corporate governance, is presented. The empirical research is built on this background, using data collected from the boards of 60 Czech joint stock companies on their relationships with customers. Correlation analysis and multivariate regression analysis were employed to test two hypotheses on the sample. A weak positive correlation between the stakeholder approach and company size was identified. However, neither hypothesis was supported, because no significant relation was found between the independent variables and financial performance.