Abstract: This study applies nonparametric data envelopment analysis (DEA) to analyze Taiwan’s 46 comprehensive and 73 technical universities from 2012 to 2017. An inter-category comparison of the percentage of efficient universities reveals that, on the whole, private universities outperform public universities in the same category, and comprehensive universities outperform technical universities. However, trend analyses confirm that, in the face of the higher education crisis, performance improvement is much more urgent for private comprehensive universities (PriCU), public technical universities (PubTECH), and private technical universities (PriTECH) than for public comprehensive universities (PubCU), and most urgent for PriTECH. The crisis has hit private universities harder than public ones and technical universities harder than comprehensive ones, and it is worsening fast. Moreover, for PubCU, PubTECH, and PriTECH to improve their overall operational efficiency, improving management efficiency or innovating in teaching and research is as crucial as optimizing operational scale. PriCU, by contrast, should first put more emphasis on improving scale efficiency to boost their efficiency. Scale efficiency must be considered together with pure technical efficiency and returns to scale, and on that basis no merger combination appears able to improve efficiency and simultaneously resolve the urgent crisis. This suggests that PriCU, PubTECH, and PriTECH should pursue alternatives to merger, such as raising income from outputs other than tuition fees, to reduce the shock as much as possible and thereby improve their scale efficiency. Finally, the robustness test suggests that consolidated estimation is a more objective and fair evaluation of university efficiency.
Abstract: This study applies nonparametric data envelopment analysis (DEA) to investigate two cases of educational university mergers. By comparing the performance of pre-merger and post-merger universities, it aims to provide a reference for policy makers and management in addressing the higher education crisis in Taiwan. So far, no significant merger synergies, in the form of efficiency improvements, are evident in the two post-merger cases in Taiwan. National Pingtung University (NPTU) remains technically efficient after the merger: its efficiency scores are 1.0 in every year from 2012 to 2017 except 2014. National Tsing Hua University (NTHU), however, suffers a decline in efficiency scores after the merger; its technical efficiency, pure technical efficiency, and scale efficiency all dropped.
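The three scores named above are linked by the standard DEA decomposition, technical efficiency = pure technical efficiency × scale efficiency. The following is a minimal sketch of that relationship, assuming the simplest single-input, single-output case, in which the constant-returns score reduces to each unit's output/input ratio divided by the best ratio. The function names and the figures are hypothetical and are not taken from either study.

```python
def crs_efficiency(inputs, outputs):
    """Technical efficiency under constant returns to scale for the
    single-input, single-output case: each unit's output/input ratio
    divided by the best ratio observed in the sample."""
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


def scale_efficiency(te, pte):
    """Scale efficiency from the decomposition TE = PTE * SE."""
    return te / pte


# Hypothetical data: staff (input) and graduates (output) for four units.
staff = [10.0, 20.0, 40.0, 50.0]
graduates = [20.0, 50.0, 80.0, 75.0]
te_scores = crs_efficiency(staff, graduates)
```

A unit with a technical efficiency of 0.6 and a pure technical efficiency of 0.75 would have `scale_efficiency(0.6, 0.75)` of about 0.8, meaning part of its shortfall is attributable to operating at the wrong scale rather than to management. Real studies with several inputs and outputs solve a linear program per unit instead of taking a simple ratio.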
Abstract: Accurate determination of wind turbine performance is necessary for the economic operation of a wind farm. At present, the procedure for verifying the power performance of wind turbines is based on a standard of the International Electrotechnical Commission (IEC). In this paper, nonparametric statistical inference is applied to design a simple, inexpensive method of verifying the power performance of a wind turbine. A statistical test is explained and examined, and its adequacy is tested on real data. The method uses the information collected by the Supervisory Control and Data Acquisition (SCADA) system from the sensors embedded in the wind turbines to verify the power performance of a wind farm. The study uses data on the monthly output of a wind farm in the Republic of Macedonia, measured from January 1, 2016, to December 31, 2016. Finally, the study determines whether the power performance of a wind turbine differed significantly from what would be expected. The results of applying the proposed method showed that the power performance of the wind farm under assessment was acceptable.
Abstract: Nowadays websites provide a vast number of resources for users. Recommender systems have been developed as an essential element of these websites to provide a personalized environment for users. They help users retrieve resources of interest from large sets of available resources. Because user preferences are dynamic, constructing an appropriate model to estimate user preference is the major task of recommender systems. Profile matching and latent factors are the two main approaches to identifying user preference. In this paper, we employ latent factors and profile matching to cluster the user profile and identify user preference, respectively. The method uses the Distance Dependent Chinese Restaurant Process as a Bayesian nonparametric framework to extract the latent factors from the user profile. These latent factors are mapped to user interests, and a weighted distribution is used to identify user preferences. We evaluate the proposed method using a real-world dataset that contains news tweets of a news agency (BBC). The experimental results and comparisons show the superior recommendation accuracy of the proposed approach relative to existing methods, and its ability to evolve effectively over time.
Abstract: One of the biggest challenges in nonparametric
regression is the curse of dimensionality. Additive models are known
to overcome this problem by estimating only the individual additive
effects of each covariate. However, if the model is misspecified, the
accuracy of the estimator compared to the fully nonparametric one
is unknown. In this work the efficiency of completely nonparametric
regression estimators such as the Loess is compared to the estimators
that assume additivity in several situations, including additive and
non-additive regression scenarios. The comparison is done by
computing the oracle mean squared error of the estimators with respect
to the true nonparametric regression function. Then, a backward
elimination selection procedure based on the Akaike Information
Criterion is proposed, which is computed from either the additive or
the nonparametric model. Simulations show that if the additive model
is misspecified, the percentage of time it fails to select important
variables can be higher than that of the fully nonparametric approach.
A dimension reduction step is included when the nonparametric estimator
cannot be computed due to the curse of dimensionality. Finally, the
Boston housing dataset is analyzed using the proposed backward
elimination procedure and the selected variables are identified.
Abstract: Vibration during machining is critical because it affects the cutting tool, machine, and workpiece, leading to tool wear, tool breakage, and unacceptable surface roughness. This paper applies a nonparametric statistical method, the single decision tree (SDT), to identify the factors affecting vibration in the machining process. Workpiece material (AISI 1045 Steel, AA2024 Aluminum alloy, A48-class30 Gray Cast Iron), cutting tool (conventional, cutting tool with holes in the toolholder, cutting tool filled with epoxy-granite), tool overhang (41-65 mm), spindle speed (630-1000 rpm), feed rate (0.05-0.075 mm/rev), and depth of cut (0.05-0.15 mm) were used as input variables, while vibration was the output parameter. It is concluded that workpiece material is the most important parameter for natural frequency, followed by cutting tool and overhang.
Abstract: This research investigates how the globalisation of fast food has affected students’ food choice. A mixed-method approach was used, combining quantitative and qualitative methods. The quantitative method used a self-completion questionnaire administered to a random sample of one hundred and four students, while the qualitative method used semi-structured interviews to survey four students on their knowledge of fast food and their choice to consume it. A cross-tabulation of variables and the Kruskal-Wallis nonparametric test were used to analyse the quantitative data, while the qualitative data were analysed by deducing themes and trends from the interview transcripts. The findings revealed that globalisation has amplified the evolution of fast food, popularising it among students. Its global presence has affected students’ food choice and preference. Price, convenience, taste, and peer influence are some of the major factors affecting students’ choice of fast food. Although students are familiar with the health effects of fast food and the significance of using food information labels to make healthy choices, they prefer fast food to homemade food.
Abstract: Self-driving vehicles require a high level of situational
awareness in order to maneuver safely when driving in real-world
conditions. This paper presents a LiDAR-based real-time perception
system that is able to process raw sensor data for multiple-target
detection and tracking in dynamic environments. The proposed
algorithm is nonparametric and deterministic; that is, no assumptions
or a priori knowledge about the input data are needed and no
initialization is required. Additionally, the proposed method
works directly on the three-dimensional data generated by the LiDAR,
without sacrificing the rich information contained in the 3D
domain. Moreover, a fast and efficient real-time clustering algorithm
based on radially bounded nearest neighbor (RBNN) is applied.
The Hungarian algorithm and adaptive Kalman filtering are
used for data association and tracking. The proposed
algorithm is able to run in real time with an average run time of 70 ms
per frame.
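The clustering step can be illustrated with a minimal sketch of the radially bounded nearest neighbor idea: any two points closer than a fixed radius end up in the same cluster. This is a brute-force illustration under stated assumptions, not the paper's implementation; practical versions typically use a spatial index such as a k-d tree and discard very small clusters as noise. The function name, the radius, and the sample points are hypothetical.

```python
import math


def rbnn_clusters(points, radius):
    """Label points so that any two points within `radius` of each other
    share a cluster label (brute force, O(n^2), union-find merging)."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= radius:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i  # merge the two clusters

    # Map each root to a compact label 0, 1, 2, ... in order of appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]


# Hypothetical 3D returns: two well-separated objects.
cloud = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0),
         (10.0, 0.0, 0.0), (10.4, 0.0, 0.0)]
cluster_labels = rbnn_clusters(cloud, radius=0.6)
```

The only tuning parameter is the radius, and no initial cluster centres are required, which matches the deterministic, initialization-free behaviour described in the abstract.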
Abstract: Statistical study has become indispensable in various fields of knowledge. Geotechnics is no different: the study of probabilistic and statistical methods has gained ground there because of its use in characterizing the uncertainties inherent in soil properties. One situation that engineers constantly face is the definition of a probability distribution that adequately represents the sampled data. To discard poorly fitting distributions, goodness-of-fit tests are necessary. In this paper, three non-parametric goodness-of-fit tests are applied to a computationally generated data set to assess how well it fits a series of known distributions. It is shown that the use of the normal distribution does not always provide satisfactory results regarding the physical and behavioral representation of the modeled parameters.
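The abstract does not name the three tests it applies. Purely as an illustration of what a nonparametric goodness-of-fit statistic measures, the sketch below computes the Kolmogorov-Smirnov distance between a sample's empirical distribution and a candidate distribution function. The choice of this test, the helper names, and the sample values are assumptions.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(sample, cdf):
    """Largest vertical gap between the empirical CDF of `sample`
    and the candidate `cdf` (the Kolmogorov-Smirnov D statistic)."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Hypothetical soil-property sample compared with a normal candidate.
sample = [17.2, 18.1, 18.4, 19.0, 19.3, 20.5, 22.8]
d_normal = ks_statistic(sample, lambda x: normal_cdf(x, 19.3, 1.8))
```

A smaller D means the candidate distribution tracks the data more closely; the statistic is then compared with a critical value for the sample size to decide whether the candidate should be discarded.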
Abstract: A practical and efficient approach is proposed for
estimating the energy of seismoacoustic sources in C-OTDR
monitoring systems. The approach is a sequential plan
for confidence estimation of both the seismoacoustic source energy
and the absorption coefficient of the soil. The sequential plan
delivers non-asymptotic guaranteed accuracy of the obtained
estimates in the form of non-asymptotic confidence regions of
prescribed size. These confidence regions are valid for a finite
sample size when the distributions of the observations are unknown.
Thus, the proposed estimates are non-asymptotic and nonparametric,
and they guarantee the prescribed estimation accuracy
in the form of a confidence region of prescribed size and a prescribed
confidence coefficient.
Abstract: A robust sequential nonparametric method is proposed
for real-time adaptation to background noise parameters. The
distribution of background noise is modelled as a Huber
contamination mixture. The method is designed to operate as an
adaptation unit, which is included in the detection subsystem of an
integrated multichannel monitoring system. The proposed method
guarantees the given size of a nonasymptotic confidence set for noise
parameters. Properties of the suggested method are rigorously
proved. The proposed algorithm has been successfully tested in real
conditions of a functioning C-OTDR monitoring system, which was
designed to monitor railways.
Abstract: Problems arising from unbalanced data sets
commonly appear in real-world applications. Due to unequal class
distribution, many researchers have found that the performance of
existing classifiers tends to be biased towards the majority class. The
k-nearest neighbors nonparametric discriminant analysis is a method
that was proposed for classifying unbalanced classes with good
performance. In this study, discriminant analysis methods are used
to investigate misclassification error rates for class-imbalanced
data of three diabetes risk groups. The purpose of this
study was to compare the classification performance between
parametric discriminant analysis and nonparametric discriminant
analysis in a three-class classification of class-imbalanced data of
diabetes risk groups. Data from a project maintaining healthy
conditions for 599 employees of a government hospital in Bangkok
were obtained for the classification problem. The employees were
divided into three diabetes risk groups: non-risk (90%), risk (5%),
and diabetic (5%). The original data including the variables of
diabetes risk group, age, gender, blood glucose, and BMI were
analyzed and bootstrapped for 50 and 100 samples, 599 observations
per sample, for additional estimation of the misclassification error
rate. Each data set was examined for departure from multivariate
normality and for equality of the covariance matrices of the three risk
groups. Both the original data and the bootstrap samples showed
non-normality and unequal covariance matrices. The parametric linear
discriminant function, quadratic discriminant function, and the
nonparametric k-nearest neighbors discriminant function were
applied to the 50 and 100 bootstrap samples and to the
original data. In searching for the optimal classification rule, the
prior probabilities were set to both equal proportions (0.33:0.33:0.33)
and unequal proportions of (0.90:0.05:0.05), (0.80:0.10:0.10),
and (0.70:0.15:0.15). The results from the 50 and 100 bootstrap samples
indicated that the k-nearest neighbors approach with k=3 or k=4 and
prior probabilities for non-risk:risk:diabetic of 0.90:0.05:0.05
or 0.80:0.10:0.10 gave the smallest misclassification error
rate. The k-nearest neighbors approach is therefore
suggested for classifying three-class imbalanced data of diabetes
risk groups.
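How prior probabilities enter a k-nearest neighbors discriminant rule can be sketched as follows. One common formulation, assumed here and not confirmed by the abstract, scores each class by its prior times the share of that class's training points that fall among the k nearest neighbors, then normalizes the scores into posterior probabilities. The function name, the two features, and all data values are hypothetical.

```python
import math
from collections import Counter


def knn_posteriors(x, train_x, train_y, k, priors):
    """Posterior class probabilities for point `x`, proportional to
    prior[c] * (neighbors of class c among the k nearest) / (size of class c)."""
    class_sizes = Counter(train_y)
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(x, train_x[i]))[:k]
    k_counts = Counter(train_y[i] for i in nearest)
    scores = {c: priors[c] * k_counts.get(c, 0) / class_sizes[c]
              for c in class_sizes}
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


# Hypothetical training data: (blood glucose, BMI) with a risk-group label.
train_x = [(85, 22), (90, 23), (88, 21), (92, 24),
           (110, 28), (112, 29), (140, 31), (150, 33)]
train_y = ["non-risk"] * 4 + ["risk"] * 2 + ["diabetic"] * 2

equal = {"non-risk": 1 / 3, "risk": 1 / 3, "diabetic": 1 / 3}
skewed = {"non-risk": 0.90, "risk": 0.05, "diabetic": 0.05}
query = (111, 28.5)
post_equal = knn_posteriors(query, train_x, train_y, k=3, priors=equal)
post_skewed = knn_posteriors(query, train_x, train_y, k=3, priors=skewed)
```

In this toy example the same query point is assigned to different groups under the two sets of priors, which is why the study compares several prior settings when searching for the rule with the smallest misclassification error.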
Abstract: An adaptive nonparametric method is proposed for
stable real-time detection of seismoacoustic sources in multichannel
C-OTDR systems with a significant number of channels. This
method guarantees given upper boundaries for probabilities of Type I
and Type II errors. Properties of the proposed method are rigorously
proved. The results of practical applications of the proposed method
in a real C-OTDR system are presented in this report.
Abstract: Traditional document representation for classification
follows the Bag of Words (BoW) approach to represent term weights.
The conventional method uses the Vector Space Model (VSM) to
exploit the statistical information of terms in the documents, but it
fails to address the semantic information and the order of the terms
present in the documents. The phrase-based approach, in turn,
follows the order of the terms present in the documents but not the
semantics behind the words. Therefore, a semantic concept-based
approach is used in this paper to enhance the semantics by
incorporating ontology information. In this paper a novel method
is proposed to forecast the intraday stock market price directional
movement based on the sentiments from Twitter and money control
news articles. Stock market forecasting is a very difficult and
highly complicated task because it is affected by many factors, such
as economic conditions, political events, and investor sentiment.
Stock market series are generally dynamic, nonparametric, noisy,
and chaotic by nature. Sentiment analysis, together with the wisdom of
crowds, can automatically compute the collective intelligence about
future performance in many areas, such as the stock market, box office
sales, and election outcomes. The proposed method uses collective
sentiment about the stock market to predict stock price directional
movements. The Granger causality test shows that the collective
sentiments in the above social media have strong predictive power
for the direction (up/down) of stock price movements.
Abstract: Over the past few years, online multimedia
collections have grown at a fast pace. Several companies have shown
interest in studying ways to organise this volume of audio
information without the need for human intervention to generate
metadata. In the past few years, many applications have emerged on
the market which are capable of identifying a piece of music in a
short time. Different audio effects and degradation make it much
harder to identify the unknown piece. In this paper, an audio
fingerprinting system which makes use of a non-parametric based
algorithm is presented. Parametric analysis is also performed using
Gaussian Mixture Models (GMMs). The feature extraction methods
employed are the Mel Spectrum Coefficients and the MPEG-7 basic
descriptors. Bin numbers replaced the extracted feature coefficients
during the non-parametric modelling. The results show that
non-parametric analysis offers promising results comparable to those
reported in the literature.
Abstract: In EFL programs, rating scales used in writing
assessment are often constructed by intuition. Intuition-based scales
tend to provide inaccurate and divisive ratings of learners’ writing
performance. Hence, following an empirical approach, this study
attempted to develop a rating scale for elementary-level writing at an
EFL program in Saudi Arabia. Towards this goal, 98 students’ essays
were scored and then coded using a comprehensive taxonomy of
writing constructs and their measures. Automatic linear modeling
was run to find out which measures would best predict essay scores.
A nonparametric ANOVA, the Kruskal-Wallis test, was then used to
determine which measures could best differentiate among scoring
levels. Findings indicated that there were certain measures that could
serve as either good predictors of essay scores or differentiators
among scoring levels, or both. The main conclusion was that a rating
scale can be empirically developed using predictive and
discriminative statistical tests.
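The Kruskal-Wallis test mentioned above ranks all observations together and asks whether the rank sums differ between groups more than chance would allow. The sketch below computes the H statistic with the usual correction for ties; the function name and the example scores are hypothetical, and the statistic would normally be compared with a chi-squared distribution with (number of groups - 1) degrees of freedom.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction.
    `groups` is a list of lists of numeric observations."""
    # Pool all values, remembering which group each came from.
    pooled = sorted((v, g) for g, values in enumerate(groups) for v in values)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average.
        avg_rank = (i + 1 + j) / 2.0
        for _, g in pooled[i:j]:
            rank_sums[g] += avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = (12.0 / (n * (n + 1))
         * sum(r * r / len(g) for r, g in zip(rank_sums, groups))
         - 3 * (n + 1))
    correction = 1.0 - tie_term / (n ** 3 - n)
    return h / correction


# Hypothetical values of one writing measure at three scoring levels.
h_stat = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

A measure that yields a large H across scoring levels is one that separates the levels well, which is the sense in which the study uses the test to find differentiating measures.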
Abstract: Climate change will affect various aspects of the
hydrological cycle, such as rainfall. A change in rainfall will affect
future flood magnitude and frequency, which will affect the design
and operation of hydraulic structures. In this paper, trends in
sub-hourly, sub-daily, and daily extreme rainfall events from 18 rainfall
stations located in Tasmania, Australia are examined. Two
nonparametric tests (Mann-Kendall and Spearman’s Rho) are applied to
detect trends at the 10%, 5%, and 1% significance levels. Sub-hourly (6,
12, 18, and 30 minutes) annual maximum rainfall events have been
found to experience statistically significant upward trends at 10%
level of significance. However, sub-daily durations (1, 3, and 12
hours) exhibit decreasing trends, and no trend exists for longer-duration
rainfall events (e.g. 24 and 72 hours). Some of the durations
(e.g. 6 minutes and 6 hours) show similar results (with upward
trends) for both tests. For durations of 12, 18, and 60 minutes and
3 hours, both tests show similar downward trends. This finding
has important implications for Tasmania in the design of urban
infrastructure, where shorter-duration rainfall events are more relevant
for smaller urban catchments such as parking lots, roof catchments,
and smaller sub-divisions.
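The Mann-Kendall test used above can be sketched in a few lines: it counts how many later observations exceed earlier ones, subtracts the number that fall below, and standardizes the result. This minimal version assumes there are no tied values (so no tie correction is applied to the variance) and uses the normal approximation for the p-value; the function name and the rainfall series are hypothetical.

```python
import math


def mann_kendall(series):
    """Return (S, Z, two-sided p) for the Mann-Kendall trend test,
    assuming no ties and using the normal approximation."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of the difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity correction: shrink |S| by one before standardizing.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p


# Hypothetical annual maximum 6-minute rainfall depths (mm).
annual_max = [4.1, 4.3, 4.0, 4.6, 4.8, 5.1, 4.9, 5.4, 5.6, 5.9]
s_stat, z_stat, p_value = mann_kendall(annual_max)
```

A positive Z indicates an upward trend and a negative Z a downward trend; the trend is declared significant at the 10%, 5%, or 1% level when the p-value falls below 0.10, 0.05, or 0.01 respectively.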
Abstract: Two new algorithms for nonparametric estimation of errors-in-variables models are proposed. The first algorithm is based on a penalized regression spline. The spline is represented as a piecewise-linear function, and orthogonal regression is estimated for each linear portion. This algorithm is iterative. The second algorithm involves locally weighted regression estimation. When the independent variable is measured with error, such estimation is a complex nonlinear optimization problem. The simulation results show the advantage of the second algorithm under the assumption that the true values of the smoothing parameters are known. Nevertheless, using fit indices to select the smoothing parameters gives similar results and has an oversmoothing effect.
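Orthogonal regression, which the first algorithm applies to each linear portion, minimizes perpendicular rather than vertical distances to the line, which suits the case where both variables carry error. The sketch below fits a single straight line using the closed-form slope, assuming equal error variance in both variables and a non-zero covariance; it does not reproduce the penalized spline or the iteration described in the abstract, and the function name is hypothetical.

```python
import math


def orthogonal_regression(xs, ys):
    """Fit y = a + b*x by minimizing perpendicular distances.
    Assumes equal error variances in x and y and non-zero covariance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = (s_yy - s_xx + math.sqrt((s_yy - s_xx) ** 2 + 4.0 * s_xy ** 2)) / (2.0 * s_xy)
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Points lying exactly on y = 1 + 2x are recovered exactly.
a_hat, b_hat = orthogonal_regression([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

Ordinary least squares would attribute all error to y and bias the slope towards zero when x is noisy; the orthogonal fit treats the two variables symmetrically.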
Abstract: The aim of this study was to find out whether a special type of exercise with an elastic cord can improve postural stability. The exercise programme was conducted twice a week for 3 months. The participants were randomly divided into an experimental group and a control group. An electronic balance board was used to test postural stability. All participants trained for 18 hours during the experiment without any special form of coordination programme. The experimental group performed an additional 90 minutes of coordination exercise. The results showed that differences between pre-test and post-test occurred in the experimental group. The nonparametric Wilcoxon signed-rank test for paired samples was used (p=0.012; 95% confidence level). We calculated effect size using Cohen's d. In the experimental group, d is 1.96, which indicates a large effect. In the control group, d is 0.04, which confirms that there was no significant improvement.
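The two quantities reported above can be sketched as follows. The Wilcoxon test for paired samples ranks the absolute pre/post differences and sums the ranks of the positive and negative differences separately; Cohen's d divides the mean change by a standard deviation. The abstract does not say which variant of d was used, so the pooled standard deviation of the pre-test and post-test scores is an assumption here, as are the function names and the scores.

```python
import math
from statistics import mean, stdev


def wilcoxon_signed_rank(pre, post):
    """Return (W+, W-): rank sums of positive and negative paired
    differences. Zero differences are dropped; ties share average ranks."""
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    ordered = sorted(diffs, key=abs)
    w_plus = w_minus = 0.0
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        for d in ordered[i:j]:
            if d > 0:
                w_plus += avg_rank
            else:
                w_minus += avg_rank
        i = j
    return w_plus, w_minus


def cohens_d(pre, post):
    """Mean change divided by the pooled standard deviation (assumed variant)."""
    pooled_sd = math.sqrt((stdev(pre) ** 2 + stdev(post) ** 2) / 2.0)
    return (mean(post) - mean(pre)) / pooled_sd


# Hypothetical balance scores for five participants.
pre_scores = [10, 12, 11, 13, 9]
post_scores = [12, 15, 12, 17, 14]
w_plus, w_minus = wilcoxon_signed_rank(pre_scores, post_scores)
effect = cohens_d(pre_scores, post_scores)
```

The smaller of the two rank sums is the test statistic that is compared with a critical value or converted to a p-value. By the commonly used convention, d values around 0.2, 0.5, and 0.8 are read as small, medium, and large effects.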
Abstract: This paper presents a nonparametric method to obtain the hazard rate “Bathtub curve” for power system components. The model is a mixture of the three known phases of a component's life: the decreasing failure rate (DFR), the constant failure rate (CFR), and the increasing failure rate (IFR), represented by three parametric Weibull models. The parameters are obtained by fitting the model simultaneously to the kernel nonparametric hazard rate curve. The useful lifetime and the characteristic lifetime are defined from the Weibull parameters and failure rate curves. To demonstrate the model, historical time-to-failure data of distribution transformers are used as an example. The resulting “Bathtub curve” shows the failure rate over the equipment lifetime, which can be applied in economic and replacement decision models.
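The shape of such a curve can be illustrated with the Weibull hazard function h(t) = (β/η)(t/η)^(β-1), which decreases for shape β < 1, is constant for β = 1, and increases for β > 1. The sketch below assumes the three phases combine additively, which the abstract does not specify, and the parameter values are hypothetical rather than fitted to any transformer data.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate at time t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def bathtub_hazard(t, phases):
    """Sum of Weibull hazards (assumed additive combination of phases).
    `phases` is a list of (shape, scale) pairs."""
    return sum(weibull_hazard(t, shape, scale) for shape, scale in phases)


# Hypothetical parameters: early failures, random failures, wear-out.
phases = [(0.5, 100.0), (1.0, 50.0), (4.0, 40.0)]
curve = [(t, bathtub_hazard(t, phases)) for t in (1.0, 20.0, 60.0)]
```

With these values the hazard is high at t = 1, lower at t = 20, and high again at t = 60, giving the characteristic bathtub shape. In the approach described above, the shape and scale parameters would instead be chosen so that the combined curve matches the kernel estimate of the hazard rate.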