Automatic Text Summarization

This work proposes an approach to automatic text summarization. The approach is a trainable summarizer that takes several features into account to generate summaries: sentence position, positive keywords, negative keywords, sentence centrality, sentence resemblance to the title, sentence inclusion of named entities, sentence inclusion of numerical data, sentence relative length, the bushy path of the sentence, and aggregated similarity for each sentence. First, we investigate the effect of each sentence feature on the summarization task. Then we use the score function over all features to train genetic algorithm (GA) and mathematical regression (MR) models to obtain a suitable combination of feature weights. The performance of the proposed approach is measured at several compression rates on a data corpus composed of 100 English religious articles. The results of the proposed approach are promising.
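The idea of combining per-sentence feature scores through weights can be illustrated with a minimal sketch. It assumes a plain weighted sum and a simple "keep the top fraction" rule. The function names, the two example features, and the weight values are hypothetical placeholders, not the GA- or MR-trained weights from the paper.

```python
from typing import Dict, List


def sentence_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of feature values (assumed form of the score function)."""
    return sum(weights[name] * value for name, value in features.items())


def summarize(
    sentences: List[str],
    feature_rows: List[Dict[str, float]],
    weights: Dict[str, float],
    compression_rate: float,
) -> List[str]:
    """Keep the highest-scoring sentences, returned in document order."""
    n_keep = max(1, round(len(sentences) * compression_rate))
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: sentence_score(feature_rows[i], weights),
        reverse=True,
    )
    return [sentences[i] for i in sorted(ranked[:n_keep])]


# Hypothetical feature values and weights, for illustration only.
example_sentences = ["a", "b", "c", "d"]
example_rows = [
    {"position": 1.0, "title_similarity": 0.2},
    {"position": 0.5, "title_similarity": 0.9},
    {"position": 0.2, "title_similarity": 0.1},
    {"position": 0.1, "title_similarity": 0.8},
]
example_weights = {"position": 0.5, "title_similarity": 0.5}
summary = summarize(example_sentences, example_rows, example_weights, 0.5)
```

In a trained system the weights would come from the optimization step rather than being fixed by hand.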

Prospects, Problems of Marketing Research and Data Mining in Turkey

The objective of this paper is to review and assess the methodological issues and problems in marketing research and in data and knowledge mining in Turkey. In summary, academic marketing research publications in Turkey have significant problems. The most vital problem seems to be related to modeling: most of the publications had major weaknesses in modeling. There were also serious problems regarding measurement and scaling, sampling, and analysis. Analysis myopia seems to be the most important problem for young academics in Turkey. Another very important finding is the lack of publications on data and knowledge mining in the academic world.

The Modified Eigenface Method using Two Thresholds

This paper adopts a new approach based on Turk and Pentland's eigenface method. It was found that the probability density function of the distance between the projection vector of the input face image and the average projection vector of the subject in the face database follows a Rayleigh distribution. In order to decrease the false acceptance rate and increase the recognition rate, the input face image is recognized using two thresholds: the acceptance threshold and the rejection threshold. We also find that the values of the two thresholds approach each other as the number of trials increases. During training, in order to reduce the number of trials, the projection vectors for each subject are averaged. Recognition experiments using the proposed algorithm show that the recognition rate reaches 92.875% while the average number of judgments is only 2.56.
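A two-threshold decision of this kind can be sketched as follows. The sketch assumes a Euclidean distance, an acceptance threshold below the rejection threshold, and a "retry" outcome (another trial) when the distance falls between the two. These details and all names are illustrative assumptions, not taken from the paper.

```python
import math
from typing import Sequence


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two projection vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def judge(
    projection: Sequence[float],
    subject_mean: Sequence[float],
    accept_threshold: float,
    reject_threshold: float,
) -> str:
    """Return 'accept', 'reject', or 'retry' (assumed meaning: run another trial)."""
    d = distance(projection, subject_mean)
    if d <= accept_threshold:
        return "accept"
    if d >= reject_threshold:
        return "reject"
    return "retry"
```

For example, `judge([0.0, 0.0], [3.0, 4.0], 2.0, 8.0)` gives a distance of 5, which lies between the thresholds, so the hypothetical outcome is another trial.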

A Proposed Hybrid Approach for Feature Selection in Text Document Categorization

Text document categorization involves a large amount of data or features. The high dimensionality of the features is troublesome and can affect the performance of the classification. Therefore, feature selection is considered one of the crucial parts of text document categorization. Selecting the best features to represent documents can reduce the dimensionality of the feature space and hence increase performance. Many approaches have been implemented by various researchers to overcome this problem. This paper proposes a novel hybrid approach for feature selection in text document categorization based on Ant Colony Optimization (ACO) and Information Gain (IG). We also present state-of-the-art algorithms by several other researchers.
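As an illustration of the Information Gain half of the hybrid only, the sketch below computes the standard IG of a binary "term present" feature with respect to document class labels. The ACO component is not described in enough detail here to sketch. The function names and the toy labels are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(term_present: List[bool], labels: List[str]) -> float:
    """Entropy reduction obtained by splitting documents on term presence."""
    total = len(labels)
    with_term = [lab for p, lab in zip(term_present, labels) if p]
    without_term = [lab for p, lab in zip(term_present, labels) if not p]
    remainder = sum(
        len(subset) / total * entropy(subset)
        for subset in (with_term, without_term)
        if subset  # skip an empty side of the split
    )
    return entropy(labels) - remainder
```

Terms with higher gain separate the classes better, so a selector could rank terms by this value and keep the top ones.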

Wavelet based ANN Approach for Transformer Protection

This paper presents the development of a wavelet-based algorithm for distinguishing between magnetizing inrush currents and power system fault currents, which is an adequate, reliable, fast and computationally efficient tool. The proposed technique consists of a preprocessing unit based on the discrete wavelet transform (DWT) in combination with an artificial neural network (ANN) for detecting and classifying fault currents. The DWT acts as an extractor of distinctive features in the input signals at the relay location. This information is then fed into an ANN for classifying fault and magnetizing inrush conditions. A 220/55/55 V, 50 Hz laboratory transformer connected to a 380 V power system was simulated using ATP-EMTP. The DWT was implemented in Matlab, and the Coiflet mother wavelet was used to analyze primary currents and generate training data. The simulated results clearly show that the proposed technique can accurately discriminate between magnetizing inrush and fault currents in transformer protection.

A Novel Multiplex Real-Time PCR Assay Using TaqMan MGB Probes for Rapid Detection of Trisomy 21

Cytogenetic analysis remains the gold standard method for prenatal diagnosis of trisomy 21 (Down syndrome, DS). Nevertheless, conventional cytogenetic analysis needs live cultured cells and is too time-consuming for clinical application. In contrast, molecular methods such as FISH, QF-PCR, MLPA and quantitative Real-time PCR are rapid assays with results available in 24 h. In the present study, we have successfully used a novel MGB TaqMan probe-based Real-time PCR assay for rapid diagnosis of trisomy 21 status in Down syndrome samples. We have also compared the results of this molecular method with the corresponding results obtained by cytogenetic analysis. Blood samples obtained from DS patients (n=25) and normal controls (n=20) were tested by quantitative Real-time PCR in parallel with standard G-banding analysis. Genomic DNA was extracted from peripheral blood lymphocytes. A high-precision TaqMan probe quantitative Real-time PCR assay was developed to determine the gene dosage of DSCAM (target gene on 21q22.2) relative to PMP22 (reference gene on 17p11.2). The DSCAM/PMP22 ratio was calculated according to the formula: ratio = $2^{-\Delta\Delta C_T}$. The quantitative Real-time PCR was able to distinguish between trisomy 21 samples and normal controls, with gene ratios of 1.49±0.13 and 1.03±0.04 respectively (p value
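The gene dosage ratio formula can be illustrated with a short sketch of the usual comparative CT calculation. It assumes the common definition in which ΔCT is the target CT minus the reference CT and ΔΔCT is the sample ΔCT minus a calibrator (normal control) ΔCT. The abstract does not state how the calibrator was chosen, so that part, the function name, and the CT values used below are hypothetical.

```python
def gene_dosage_ratio(
    ct_target_sample: float,
    ct_reference_sample: float,
    ct_target_calibrator: float,
    ct_reference_calibrator: float,
) -> float:
    """Relative gene dosage, ratio = 2 ** (-ddCT), under the assumed definitions."""
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical CT values: the target amplifies about 0.585 cycles earlier
# than in the calibrator, which corresponds to three copies versus two.
ratio = gene_dosage_ratio(24.415, 25.0, 25.0, 25.0)
```

Three copies against two gives a theoretical ratio of 1.5, which is the neighbourhood in which the reported trisomy 21 ratios fall.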

Integrating Fast Karnough Map and Modular Neural Networks for Simplification and Realization of Complex Boolean Functions

In this paper a new fast simplification method is presented. The method realizes Karnaugh maps with a large number of variables. In order to accelerate the operation of the proposed method, a new approach for fast detection of groups of ones is presented. This approach is implemented in the frequency domain: the search operation relies on performing cross correlation in the frequency domain rather than in the time domain. It is proved mathematically and practically that the number of computation steps required for the presented method is less than that needed by conventional cross correlation. Simulation results using MATLAB confirm the theoretical computations. Furthermore, a powerful solution for the realization of complex functions is given. The simplified functions are implemented using a new design for neural networks. Neural networks are used because they are fault tolerant and, as a result, can recognize signals even with noise or distortion. This is very useful for logic functions used in data and computer communications. Moreover, the implemented functions are realized with a minimum number of components. This is done by using modular neural nets (MNNs) that divide the input space into several homogeneous regions. The approach is applied to implement the XOR function, 16 logic functions on the one-bit level, and a 2-bit digital multiplier. Compared to previous non-modular designs, a clear reduction in the order of computations and hardware requirements is achieved.

Gender Differences of Elementary Prospective Teachers in Mathematical Beliefs and Mathematics Teaching Anxiety

In this study, possible differences in the mathematics beliefs and mathematics teaching anxiety of prospective elementary mathematics teachers were investigated according to gender. For this purpose, 1st, 2nd, 3rd and 4th grade students from a government university in Turkey were selected as the sample. The Mathematics Teaching Anxiety Scale (MATAS) and the Beliefs About Mathematics Survey (BAMS) were used as data collection tools. The study found that, according to their mathematical beliefs, prospective male teachers take a more instrumentalist approach to learning mathematics than females. On the other hand, females have more mathematics teaching anxiety than males, especially regarding subject knowledge in mathematics and self-confidence.

A Hybrid Metaheuristic Framework for Evolving the PROAFTN Classifier

In this paper, a new learning algorithm based on a hybrid metaheuristic integrating Differential Evolution (DE) and Reduced Variable Neighborhood Search (RVNS) is introduced to train the classification method PROAFTN. To apply PROAFTN, the values of several parameters need to be determined prior to classification. These parameters include the boundaries of intervals and the relative weights for each attribute. Based on these requirements, the hybrid approach, named DEPRO-RVNS, is presented in this study. In some cases, the major problem when applying DE to classification problems was the premature convergence of some individuals to local optima. To eliminate this shortcoming and to improve the exploration and exploitation capabilities of DE, such individuals were iteratively re-explored using RVNS. Based on the results generated on both training and testing data, it is shown that the performance of PROAFTN is significantly improved. Furthermore, the experimental study shows that DEPRO-RVNS outperforms well-known machine learning classifiers on a variety of problems.

Adsorptive Removal of Vapors of Toxic Sulfur Compounds using Activated Carbons

Adsorption of CS2 vapors has been studied on different types of activated carbons obtained from different source raw materials. The activated carbons have different surface areas and are associated with varying amounts of carbon-oxygen surface groups. The adsorption of CS2 vapors is not directly related to surface area, but is considerably influenced by the presence of carbon-oxygen surface groups. The adsorption decreases when the amount of carbon-oxygen surface groups is increased by oxidation and increases when these surface groups are eliminated by degassing. The adsorption is at its maximum for the 950°-degassed carbon sample, which is almost completely free of any associated oxygen. The kinetic data, as analysed by the empirical diffusion model and the linear driving force mass transfer model, indicate that the adsorption does not involve Fickian diffusion but may be considered a pseudo-first-order mass transfer process. The activation energy of adsorption and the isosteric enthalpies of adsorption indicate that the adsorption does not involve interaction between CS2 and carbon-oxygen surface groups, but rather hydrophobic interactions between CS2 and C-C atoms in the carbon lattice.

Development of a Catchment Water Quality Model for Continuous Simulations of Pollutants Build-up and Wash-off

Estimation of runoff water quality parameters is required to determine appropriate water quality management options. Various models are used to estimate runoff water quality parameters. However, most models provide event-based estimates of water quality parameters for specific sites. The work presented in this paper describes the development of a model that continuously simulates the accumulation and wash-off of water quality pollutants in a catchment. The model allows estimation of pollutant build-up during dry periods and pollutant wash-off during storm events. The model was developed by integrating two individual models: a rainfall-runoff model and a catchment water quality model. The rainfall-runoff model is based on the time-area runoff estimation method. The model allows users to estimate the time of concentration using a range of established methods. The model also allows estimation of the continuing runoff losses using any of the available estimation methods (i.e., constant, linearly varying or exponentially varying). Pollutant build-up in a catchment was represented by one of three pre-defined functions: power, exponential, or saturation. Similarly, pollutant wash-off was represented by one of three different functions: power, rating-curve, or exponential. The developed runoff water quality model was set up to simulate the build-up and wash-off of total suspended solids (TSS), total phosphorus (TP) and total nitrogen (TN). The application of the model was demonstrated using available runoff and TSS field data from road and roof surfaces in the Gold Coast, Australia. The model provided an excellent representation of the field data, demonstrating the simplicity yet effectiveness of the proposed model.
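A continuous build-up and wash-off loop can be sketched in a few lines. The sketch assumes one commonly used exponential form for each process, a daily time step, and rainfall depth as the wash-off driver. The paper's actual equations, parameters, and rainfall-runoff coupling are not given in the abstract, so every name and value below is a hypothetical placeholder.

```python
import math
from typing import List, Tuple


def simulate(
    daily_rain_mm: List[float],
    max_load: float,
    buildup_rate: float,
    washoff_coeff: float,
    initial_load: float = 0.0,
) -> Tuple[List[float], float]:
    """Return (pollutant washed off each day, load left on the surface)."""
    load = initial_load
    washed_series = []
    for rain in daily_rain_mm:
        if rain > 0.0:
            # Assumed exponential wash-off: a fraction of the stored load leaves.
            washed = load * (1.0 - math.exp(-washoff_coeff * rain))
            load -= washed
        else:
            # Assumed exponential build-up: load approaches max_load on dry days.
            washed = 0.0
            load = max_load - (max_load - load) * math.exp(-buildup_rate)
        washed_series.append(washed)
    return washed_series, load


# Two dry days followed by a 5 mm storm, with made-up parameters.
washed, remaining = simulate([0.0, 0.0, 5.0], 10.0, 0.5, 0.2)
```

Because the stored load carries over from one day to the next, the same loop handles a long rainfall record without resetting between events, which is the point of a continuous rather than event-based simulation.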

An Attribute-Centre Based Decision Tree Classification Algorithm

Decision tree algorithms have a very important place among the classification models of data mining. In the literature, algorithms use the entropy concept or the Gini index to form the tree. The shape of the classes and their closeness to each other are some of the factors that affect the performance of the algorithm. In this paper we introduce a new decision tree algorithm which employs a data (attribute) folding method and the variation of the class variables over the branches to be created. A comparative performance analysis has been carried out between the proposed algorithm and C4.5.

An Improved Algorithm for Calculation of the Third-order Orthogonal Tensor Product Expansion by Using Singular Value Decomposition

As a method of expanding higher-order tensor data into tensor products of vectors, we previously proposed the Third-order Orthogonal Tensor Product Expansion (3OTPE), which performs an expansion similar to Higher-Order Singular Value Decomposition (HOSVD). In this paper we provide a computation algorithm that improves our previous method, in which SVD is applied to the matrix formed by the contraction of the original tensor data and one of the expansion vectors obtained. The residual of the improved method is smaller than that of the previous method when the expansion is truncated to the same number of terms. Moreover, the residual is smaller than that of HOSVD when applied to color image data. We confirm that the computing time of the improved method is the same as that of the previous method and considerably better than that of HOSVD.

A New Technique for Solar Activity Forecasting Using Recurrent Elman Networks

In this paper we present an efficient approach for the prediction of two sunspot-related time series, namely the Yearly Sunspot Number and the IR5 Index, that are commonly used for monitoring solar activity. The method is based on partially recurrent Elman networks and can be divided into three main steps. The first consists of a "de-rectification" of the time series under study in order to obtain a new time series whose appearance, similar to a sum of sinusoids, can be modelled by our neural networks much better than the original dataset. After that, we normalize the de-rectified data so that they have zero mean and unit standard deviation. Finally, we train an Elman network with only one input, a recurrent hidden layer and one output using a back-propagation algorithm with variable learning rate and momentum. The results show the efficiency of this approach, which, although very simple, can perform better than most of the existing solar activity forecasting methods.

Parallelization and Optimization of SIFT Feature Extraction on Cluster System

Scale Invariant Feature Transform (SIFT) has been widely applied, but extracting SIFT features is complicated and time-consuming. In this paper, to meet the demands of real-time applications, SIFT is parallelized and optimized on a cluster system; the result is named pSIFT. Redundant storage and communication are used for boundary data to improve performance, and before the representation of feature descriptors, data reallocation is adopted to maintain load balance in pSIFT. Experimental results show that pSIFT achieves good speedup and scalability.

The Influences of Marketing Mix on Customer Purchasing Behavior at Chatuchak Plaza Market

The objective of this research was to study the influence of the marketing mix on customers' purchasing behavior. A total of 397 respondents were sampled from the patrons of the Chatuchak Plaza market. A questionnaire was utilized as a tool to collect data. Statistics utilized in this research included frequency, percentage, mean, standard deviation, and multiple regression analysis. Data were analyzed using the Statistical Package for the Social Sciences. The findings revealed that the majority of respondents were male, aged between 25 and 34 years, held an undergraduate degree, and were married and living together. The average income of respondents was between 10,001 and 20,000 baht. In terms of occupation, the majority worked for private companies. The research analysis disclosed that three variables of the marketing mix, namely price (X2), place (X3), and product (X1), had an influence on the frequency of customer purchasing. These three variables can predict a purchase about 30 percent of the time using the equation: Y1 = 6.851 + .921(X2) + .949(X3) + .591(X1). It was also found that, in terms of the marketing mix, two variables had an influence on the amount of customer purchasing: physical characteristics (X6) and process (X7). These two variables are 17 percent predictive of purchasing using the equation: Y2 = 2276.88 + 2980.97(X6) + 2188.09(X7).
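The two fitted equations can be evaluated directly, as in the small sketch below. The coefficients are copied from the abstract. The function names are hypothetical, and the abstract does not say on what scale the predictors X1, X2, X3, X6 and X7 were measured, so the inputs are left as plain numbers.

```python
def purchase_frequency(product_x1: float, price_x2: float, place_x3: float) -> float:
    """Y1 = 6.851 + .921(X2) + .949(X3) + .591(X1), as reported."""
    return 6.851 + 0.921 * price_x2 + 0.949 * place_x3 + 0.591 * product_x1


def purchase_amount(physical_x6: float, process_x7: float) -> float:
    """Y2 = 2276.88 + 2980.97(X6) + 2188.09(X7), as reported."""
    return 2276.88 + 2980.97 * physical_x6 + 2188.09 * process_x7
```

With all predictors at zero the functions return the intercepts, and each coefficient gives the change in the predicted value per unit change in its predictor while the others are held fixed.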

An Automatic Tool for Checking Consistency between Data Flow Diagrams (DFDs)

System development life cycle (SDLC) is a process used during the development of any system. SDLC consists of four main phases: analysis, design, implementation and testing. During the analysis phase, a context diagram and data flow diagrams are used to produce the process model of a system. Consistency between the context diagram and lower-level data flow diagrams is very important for a smooth development process. However, a manual consistency check from the context diagram to lower-level data flow diagrams using a checklist is a time-consuming process. At the same time, the limited human ability to detect errors is one of the factors that influence the correctness and balancing of the diagrams. This paper presents a tool that automates the consistency check between Data Flow Diagrams (DFDs) based on the rules of DFDs. The tool serves two purposes: as an editor to draw the diagrams and as a checker to check the correctness of the diagrams drawn. The consistency check from the context diagram to lower-level data flow diagrams is embedded inside the tool to overcome the manual checking problem.

Conceptual Frameworks of Carbon Credit Registry System for Thailand

This research explores the development of the structure of a Carbon Credit Registry System that accords with the needs of future events in Thailand. This research also explores the big picture of every connected system by referring to the design of each system, the Data Flow Diagram, and the design in terms of the system's data using the DES standard. The purpose of this paper is to show how to design the model of each system. Furthermore, this paper can serve as a guideline for designing an appropriate Carbon Credit Registry System.

EZW Coding System with Artificial Neural Networks

Image compression plays a vital role in today's communication. Limited allocated bandwidth leads to slower communication. To improve the rate of transmission within the limited bandwidth, the image data must be compressed before transmission. Basically, there are two types of compression: 1) lossy compression and 2) lossless compression. Although lossy compression gives more compression than lossless compression, its retrieval accuracy is lower. The JPEG and JPEG 2000 image compression systems follow Huffman coding for image compression. The JPEG 2000 coding system uses the wavelet transform, which decomposes the image into different levels, where the coefficients in each sub-band are uncorrelated with the coefficients of other sub-bands. Embedded Zerotree Wavelet (EZW) coding exploits the multi-resolution properties of the wavelet transform to give a computationally simple algorithm with better performance compared to existing wavelet transforms. For further improvement of compression applications, other coding methods have recently been suggested. An ANN-based approach is one such method. Artificial neural networks have been applied to many problems in image processing and have demonstrated their superiority over classical methods when dealing with noisy or incomplete data for image compression applications. A performance analysis on different images is presented for an EZW coding system with the error backpropagation algorithm. The implementation and analysis show approximately 30% more accuracy in the retrieved image compared to the existing EZW coding system.

Dynamic Time Warping in Gait Classification of Motion Capture Data

A method of gait identification based on nearest neighbor classification with motion similarity assessed by dynamic time warping is proposed. Model-based kinematic motion data, represented by joint rotations coded as Euler angles and unit quaternions, are used. Different pose distance functions in the Euler angle and quaternion spaces are considered. To evaluate individual features of the subsequent joint movements during the gait cycle, joint selection is carried out. To examine the proposed approach, a database containing 353 gaits of 25 humans collected in a motion capture laboratory is used. The obtained results are promising. The classification that takes all joints into consideration has an accuracy of over 91%. Analysis of hip joint movements alone allows gaits to be correctly identified with almost 80% precision.
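Nearest neighbor classification with a dynamic time warping distance can be sketched as below. This is the textbook DTW recurrence with a pluggable pose distance. The sketch uses a single scalar joint angle per frame and an absolute-difference distance purely for illustration, whereas the paper works with Euler angle and quaternion pose distances. All names and sample sequences are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def dtw_distance(
    seq_a: Sequence[float],
    seq_b: Sequence[float],
    pose_distance: Callable[[float, float], float],
) -> float:
    """Cumulative cost of the best monotonic alignment of two motion sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = pose_distance(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify_gait(
    query: Sequence[float],
    labelled_gaits: List[Tuple[Sequence[float], str]],
    pose_distance: Callable[[float, float], float],
) -> str:
    """Return the label of the stored gait closest to the query under DTW."""
    best = min(labelled_gaits, key=lambda item: dtw_distance(query, item[0], pose_distance))
    return best[1]


def angle_diff(a: float, b: float) -> float:
    """Illustrative pose distance for one scalar joint angle."""
    return abs(a - b)
```

Because DTW tolerates differences in timing, a gait performed slightly slower, such as `[1, 2, 2, 3]` against `[1, 2, 3]`, still aligns at zero cost.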