Abstract: This work proposes an approach to address automatic
text summarization. This approach is a trainable summarizer, which
takes into account several features, including sentence position,
positive keyword, negative keyword, sentence centrality, sentence
resemblance to the title, sentence inclusion of name entity, sentence
inclusion of numerical data, sentence relative length, Bushy path of
the sentence and aggregated similarity for each sentence to generate
summaries. First we investigate the effect of each sentence feature on
the summarization task. Then we use all features score function to
train genetic algorithm (GA) and mathematical regression (MR)
models to obtain a suitable combination of feature weights. The
proposed approach performance is measured at several compression
rates on a data corpus composed of 100 English religious articles.
The results of the proposed approach are promising.
Abstract: The objective of this paper is to review and assess the
methodological issues and problems in marketing research, data and
knowledge mining in Turkey. As a summary, academic marketing
research publications in Turkey have significant problems. The most
vital problem seems to be related with modeling. Most of the
publications had major weaknesses in modeling. There were also,
serious problems regarding measurement and scaling, sampling and
analyses. Analyses myopia seems to be the most important problem
for young academia in Turkey. Another very important finding is the
lack of publications on data and knowledge mining in the academic
world.
Abstract: A new approach is adopted in this paper based
on Turk and Pentland-s eigenface method. It was found that the
probability density function of the distance between the projection
vector of the input face image and the average projection vector of
the subject in the face database, follows Rayleigh distribution. In
order to decrease the false acceptance rate and increase the
recognition rate, the input face image has been recognized using two
thresholds including the acceptance threshold and the rejection
threshold. We also find out that the value of two thresholds will be
close to each other as number of trials increases. During the training,
in order to reduce the number of trials, the projection vectors for each
subject has been averaged. The recognition experiments using the
proposed algorithm show that the recognition rate achieves to
92.875% whilst the average number of judgment is only 2.56 times.
Abstract: Text document categorization involves large amount
of data or features. The high dimensionality of features is a
troublesome and can affect the performance of the classification.
Therefore, feature selection is strongly considered as one of the
crucial part in text document categorization. Selecting the best
features to represent documents can reduce the dimensionality of
feature space hence increase the performance. There were many
approaches has been implemented by various researchers to
overcome this problem. This paper proposed a novel hybrid approach
for feature selection in text document categorization based on Ant
Colony Optimization (ACO) and Information Gain (IG). We also
presented state-of-the-art algorithms by several other researchers.
Abstract: This paper presents the development of a wavelet
based algorithm, for distinguishing between magnetizing inrush
currents and power system fault currents, which is quite adequate,
reliable, fast and computationally efficient tool. The proposed
technique consists of a preprocessing unit based on discrete wavelet
transform (DWT) in combination with an artificial neural network
(ANN) for detecting and classifying fault currents. The DWT acts as
an extractor of distinctive features in the input signals at the relay
location. This information is then fed into an ANN for classifying
fault and magnetizing inrush conditions. A 220/55/55 V, 50Hz
laboratory transformer connected to a 380 V power system were
simulated using ATP-EMTP. The DWT was implemented by using
Matlab and Coiflet mother wavelet was used to analyze primary
currents and generate training data. The simulated results presented
clearly show that the proposed technique can accurately discriminate
between magnetizing inrush and fault currents in transformer
protection.
Abstract: Cytogenetic analysis still remains the gold standard method for prenatal diagnosis of trisomy 21 (Down syndrome, DS). Nevertheless, the conventional cytogenetic analysis needs live cultured cells and is too time-consuming for clinical application. In contrast, molecular methods such as FISH, QF-PCR, MLPA and quantitative Real-time PCR are rapid assays with results available in 24h. In the present study, we have successfully used a novel MGB TaqMan probe-based real time PCR assay for rapid diagnosis of trisomy 21 status in Down syndrome samples. We have also compared the results of this molecular method with corresponding results obtained by the cytogenetic analysis. Blood samples obtained from DS patients (n=25) and normal controls (n=20) were tested by quantitative Real-time PCR in parallel to standard G-banding analysis. Genomic DNA was extracted from peripheral blood lymphocytes. A high precision TaqMan probe quantitative Real-time PCR assay was developed to determine the gene dosage of DSCAM (target gene on 21q22.2) relative to PMP22 (reference gene on 17p11.2). The DSCAM/PMP22 ratio was calculated according to the formula; ratio=2 -ΔΔCT. The quantitative Real-time PCR was able to distinguish between trisomy 21 samples and normal controls with the gene ratios of 1.49±0.13 and 1.03±0.04 respectively (p value
Abstract: In this paper a new fast simplification method is
presented. Such method realizes Karnough map with large
number of variables. In order to accelerate the operation of the
proposed method, a new approach for fast detection of group
of ones is presented. Such approach implemented in the
frequency domain. The search operation relies on performing
cross correlation in the frequency domain rather than time one.
It is proved mathematically and practically that the number of
computation steps required for the presented method is less
than that needed by conventional cross correlation. Simulation
results using MATLAB confirm the theoretical computations.
Furthermore, a powerful solution for realization of complex
functions is given. The simplified functions are implemented
by using a new desigen for neural networks. Neural networks
are used because they are fault tolerance and as a result they
can recognize signals even with noise or distortion. This is
very useful for logic functions used in data and computer
communications. Moreover, the implemented functions are
realized with minimum amount of components. This is done
by using modular neural nets (MNNs) that divide the input
space into several homogenous regions. Such approach is
applied to implement XOR function, 16 logic functions on one
bit level, and 2-bit digital multiplier. Compared to previous
non- modular designs, a clear reduction in the order of
computations and hardware requirements is achieved.
Abstract: In this study, any possible differences between mathematics beliefs and anxiety of prospective elementary mathematics teachers have been investigated according to their gender. In this purpose, 1st, 2nd, 3rd and 4th grade students from a Government University in Turkey were selected as a sample. Mathematics Teaching Anxiety Scale (MATAS) and Beliefs About Mathematics Survey (BAMS) has been used as data collection tools. As a result of the study, it has been observed that prospective male teachers have more instrumentalist approach in learning mathematics than females according to their mathematical beliefs. On the other hand, females have more mathematics teaching anxiety than males especially, for subject knowledge in mathematics and selfconfidence.
Abstract: In this paper, a new learning algorithm based on a
hybrid metaheuristic integrating Differential Evolution (DE) and
Reduced Variable Neighborhood Search (RVNS) is introduced to train
the classification method PROAFTN. To apply PROAFTN, values of
several parameters need to be determined prior to classification. These
parameters include boundaries of intervals and relative weights for
each attribute. Based on these requirements, the hybrid approach,
named DEPRO-RVNS, is presented in this study. In some cases, the
major problem when applying DE to some classification problems
was the premature convergence of some individuals to local optima.
To eliminate this shortcoming and to improve the exploration and
exploitation capabilities of DE, such individuals were set to iteratively
re-explored using RVNS. Based on the generated results on
both training and testing data, it is shown that the performance of
PROAFTN is significantly improved. Furthermore, the experimental
study shows that DEPRO-RVNS outperforms well-known machine
learning classifiers in a variety of problems.
Abstract: Adsorption of CS2 vapors has been studied on
different types of activated carbons obtained from different source
raw materials. The activated carbons have different surface areas and
are associated with varying amounts of the carbon-oxygen surface
groups. The adsorption of CS2 vapors is not directly related to surface
area, but is considerably influenced by the presence of carbonoxygen
surface groups. The adsorption decreases on increasing the
amount of carbon-oxygen surface groups on oxidation and increases
when these surface groups are eliminated on degassing. The
adsorption is maximum in case of the 950°-degassed carbon sample
which is almost completely free of any associated oxygen. The
kinetic data as analysed by Empirical diffusion model and Linear
driving force mass transfer model indicate that the adsorption does
not involve Fickian diffusion but may be considered as a pseudo first
order mass transfer process. The activation energy of adsorption and
isosteric enthalpies of adsorption indicate that the adsorption does not
involve interaction between CS2 and carbon-oxygen surface groups,
but hydrophobic interactions between CS2 and C-C atoms in the
carbon lattice.
Abstract: Estimation of runoff water quality parameters is required to determine appropriate water quality management options. Various models are used to estimate runoff water quality parameters. However, most models provide event-based estimates of water quality parameters for specific sites. The work presented in this paper describes the development of a model that continuously simulates the accumulation and wash-off of water quality pollutants in a catchment. The model allows estimation of pollutants build-up during dry periods and pollutants wash-off during storm events. The model was developed by integrating two individual models; rainfall-runoff model, and catchment water quality model. The rainfall-runoff model is based on the time-area runoff estimation method. The model allows users to estimate the time of concentration using a range of established methods. The model also allows estimation of the continuing runoff losses using any of the available estimation methods (i.e., constant, linearly varying or exponentially varying). Pollutants build-up in a catchment was represented by one of three pre-defined functions; power, exponential, or saturation. Similarly, pollutants wash-off was represented by one of three different functions; power, rating-curve, or exponential. The developed runoff water quality model was set-up to simulate the build-up and wash-off of total suspended solids (TSS), total phosphorus (TP) and total nitrogen (TN). The application of the model was demonstrated using available runoff and TSS field data from road and roof surfaces in the Gold Coast, Australia. The model provided excellent representation of the field data demonstrating the simplicity yet effectiveness of the proposed model.
Abstract: Decision tree algorithms have very important place at
classification model of data mining. In literature, algorithms use
entropy concept or gini index to form the tree. The shape of the
classes and their closeness to each other some of the factors that
affect the performance of the algorithm. In this paper we introduce a
new decision tree algorithm which employs data (attribute) folding
method and variation of the class variables over the branches to be
created. A comparative performance analysis has been held between
the proposed algorithm and C4.5.
Abstract: As a method of expanding a higher-order tensor data to tensor products of vectors we have proposed the Third-order Orthogonal Tensor Product Expansion (3OTPE) that did similar expansion as Higher-Order Singular Value Decomposition (HOSVD). In this paper we provide a computation algorithm to improve our previous method, in which SVD is applied to the matrix that constituted by the contraction of original tensor data and one of the expansion vector obtained. The residual of the improved method is smaller than the previous method, truncating the expanding tensor products to the same number of terms. Moreover, the residual is smaller than HOSVD when applying to color image data. It is able to be confirmed that the computing time of improved method is the same as the previous method and considerably better than HOSVD.
Abstract: In this paper we present an efficient approach for the prediction of two sunspot-related time series, namely the Yearly Sunspot Number and the IR5 Index, that are commonly used for monitoring solar activity. The method is based on exploiting partially recurrent Elman networks and it can be divided into three main steps: the first one consists in a “de-rectification" of the time series under study in order to obtain a new time series whose appearance, similar to a sum of sinusoids, can be modelled by our neural networks much better than the original dataset. After that, we normalize the derectified data so that they have zero mean and unity standard deviation and, finally, train an Elman network with only one input, a recurrent hidden layer and one output using a back-propagation algorithm with variable learning rate and momentum. The achieved results have shown the efficiency of this approach that, although very simple, can perform better than most of the existing solar activity forecasting methods.
Abstract: Scale Invariant Feature Transform (SIFT) has been
widely applied, but extracting SIFT feature is complicated and
time-consuming. In this paper, to meet the demand of the real-time
applications, SIFT is parallelized and optimized on cluster system,
which is named pSIFT. Redundancy storage and communication are
used for boundary data to improve the performance, and before
representation of feature descriptor, data reallocation is adopted to
keep load balance in pSIFT. Experimental results show that pSIFT
achieves good speedup and scalability.
Abstract: The objective of this research was to study the
influence of marketing mix on customers purchasing behavior. A
total of 397 respondents were collected from customers who were the
patronages of the Chatuchak Plaza market. A questionnaire was
utilized as a tool to collect data. Statistics utilized in this research
included frequency, percentage, mean, standard deviation, and
multiple regression analysis. Data were analyzed by using Statistical
Package for the Social Sciences. The findings revealed that the
majority of respondents were male with the age between 25-34 years
old, hold undergraduate degree, married and stay together. The
average income of respondents was between 10,001-20,000 baht. In
terms of occupation, the majority worked for private companies. The
research analysis disclosed that there were three variables of
marketing mix which included price (X2), place (X3), and product
(X1) which had an influence on the frequency of customer
purchasing. These three variables can predict a purchase about 30
percent of the time by using the equation; Y1 = 6.851 + .921(X2) +
.949(X3) + .591(X1). It also found that in terms of marketing mixed,
there were two variables had an influence on the amount of customer
purchasing which were physical characteristic (X6), and the process
(X7). These two variables are 17 percent predictive of a purchasing
by using the equation: Y2 = 2276.88 + 2980.97(X6) + 2188.09(X7).
Abstract: System development life cycle (SDLC) is a
process uses during the development of any system. SDLC
consists of four main phases: analysis, design, implement and
testing. During analysis phase, context diagram and data flow
diagrams are used to produce the process model of a system.
A consistency of the context diagram to lower-level data flow
diagrams is very important in smoothing up developing
process of a system. However, manual consistency check from
context diagram to lower-level data flow diagrams by using a
checklist is time-consuming process. At the same time, the
limitation of human ability to validate the errors is one of the
factors that influence the correctness and balancing of the
diagrams. This paper presents a tool that automates the
consistency check between Data Flow Diagrams (DFDs)
based on the rules of DFDs. The tool serves two purposes: as
an editor to draw the diagrams and as a checker to check the
correctness of the diagrams drawn. The consistency check
from context diagram to lower-level data flow diagrams is
embedded inside the tool to overcome the manual checking
problem.
Abstract: This research explores on the development of the structure of Carbon Credit Registry System those accords to the need of future events in Thailand. This research also explores the big picture of every connected system by referring to the design of each system, the Data Flow Diagram, and the design in term of the system-s data using DES standard. The purpose of this paper is to show how to design the model of each system. Furthermore, this paper can serve as guideline for designing an appropriate Carbon Credit Registry System.
Abstract: Image compression plays a vital role in today-s
communication. The limitation in allocated bandwidth leads to
slower communication. To exchange the rate of transmission in the
limited bandwidth the Image data must be compressed before
transmission. Basically there are two types of compressions, 1)
LOSSY compression and 2) LOSSLESS compression. Lossy
compression though gives more compression compared to lossless
compression; the accuracy in retrievation is less in case of lossy
compression as compared to lossless compression. JPEG, JPEG2000
image compression system follows huffman coding for image
compression. JPEG 2000 coding system use wavelet transform,
which decompose the image into different levels, where the
coefficient in each sub band are uncorrelated from coefficient of
other sub bands. Embedded Zero tree wavelet (EZW) coding exploits
the multi-resolution properties of the wavelet transform to give a
computationally simple algorithm with better performance compared
to existing wavelet transforms. For further improvement of
compression applications other coding methods were recently been
suggested. An ANN base approach is one such method. Artificial
Neural Network has been applied to many problems in image
processing and has demonstrated their superiority over classical
methods when dealing with noisy or incomplete data for image
compression applications. The performance analysis of different
images is proposed with an analysis of EZW coding system with
Error Backpropagation algorithm. The implementation and analysis
shows approximately 30% more accuracy in retrieved image
compare to the existing EZW coding system.
Abstract: The method of gait identification based on the nearest neighbor classification technique with motion similarity assessment by the dynamic time warping is proposed. The model based kinematic motion data, represented by the joints rotations coded by Euler angles and unit quaternions is used. The different pose distance functions in Euler angles and quaternion spaces are considered. To evaluate individual features of the subsequent joints movements during gait cycle, joint selection is carried out. To examine proposed approach database containing 353 gaits of 25 humans collected in motion capture laboratory is used. The obtained results are promising. The classifications, which takes into consideration all joints has accuracy over 91%. Only analysis of movements of hip joints allows to correctly identify gaits with almost 80% precision.