Moving Data Mining Tools toward a Business Intelligence System

Data mining (DM) is the process of finding and extracting frequent patterns that can describe the data or predict unknown or future values. These goals are achieved by using various learning algorithms, each of which may produce a mining result completely different from the others. Some algorithms may find millions of patterns. It is thus a difficult job for data analysts to select appropriate models and interpret the discovered knowledge. In this paper, we describe the framework of an intelligent and complete data mining system called SUT-Miner. Our system comprises a full complement of major DM algorithms together with pre-DM and post-DM functionalities. It is the post-DM packages that ease DM deployment for business intelligence applications.
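
As a rough illustration of the pre-DM / DM / post-DM structure described above, the following sketch wires together a generic pipeline; the stage choices and scikit-learn components are illustrative assumptions, not the actual SUT-Miner packages.

```python
# Minimal sketch of a pre-DM -> DM -> post-DM pipeline (illustrative only;
# the concrete SUT-Miner components are not specified in the abstract).
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

pipeline = Pipeline([
    ("pre_dm", StandardScaler()),        # pre-DM: data preparation
    ("dm", DecisionTreeClassifier()),    # DM: one of many possible learners
])

# post-DM: score candidate models so the analyst does not have to sift
# through every mining result by hand.
print(cross_val_score(pipeline, X, y, cv=5).mean())
```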

Statistical Estimation of Spring-back Degree Using Texture Database

Using a texture database, spring-back was estimated statistically in this study. Both the spring-back in bending deformation and the experimental data related to crystal orientation show significant dispersion. Therefore, a probabilistic statistical approach was established for the proper quantification of these values. Correlations were examined among the distribution F(x) of spring-back, F(x) of the build-up fraction of three orientations after 92° bending, and F(x) of the as-received part, on the basis of the three-parameter Weibull distribution. The resulting spring-back estimation using the texture database yielded excellent estimates compared with experimental values.
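
For reference, a minimal sketch of the three-parameter Weibull cumulative distribution F(x) used above; the shape, scale and location values in the example are purely illustrative and are not fitted to the texture database.

```python
import math

def weibull3_cdf(x, shape, scale, location):
    """Three-parameter Weibull cumulative distribution F(x)."""
    if x <= location:
        return 0.0
    return 1.0 - math.exp(-((x - location) / scale) ** shape)

# Example: probability that the spring-back angle stays below 3.0 degrees
# for purely illustrative parameters (shape=2.5, scale=2.0, location=1.0).
print(weibull3_cdf(3.0, shape=2.5, scale=2.0, location=1.0))
```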

Cross-cultural Analysis of the Strategy of Tolerance in the Republic of Kazakhstan

Modern Kazakh society is characterized by intensifying cross-cultural communication, the emergence of new powerful subcultures, and accelerated change in social systems and values. Socio-political reforms in all fields have changed the quality of social relationships and of spiritual life. The cross-cultural approach involves the analysis of different types of behavior and communication, including the manifestation of conflict and the formation of marginal destructive stereotypes.

Selective Sulfidation of Copper, Zinc and Nickel in Plating Wastewater using Calcium Sulfide

The present work is concerned with the sulfidation of Cu-, Zn- and Ni-containing plating wastewater with CaS. The sulfidation experiments were carried out at room temperature by adding solid CaS to a simulated metal solution containing either a single metal (Ni, Zn, or Cu) or a Ni-Zn-Cu mixture. At first, the experiments were conducted without pH adjustment, and it was found that complete sulfidation of Zn and Ni was achieved at an equimolar ratio of CaS to the particular metal. In the case of Cu, however, complete sulfidation was achieved at a CaS-to-Cu molar ratio of about 2. For the selective sulfidation, a simulated plating solution containing Cu, Zn and Ni at a concentration of 100 mg/dm3 was treated with CaS under various pH conditions. As a result, selective precipitation of the metal sulfides was achieved by sulfidation treatment at different pH values. Further, the precipitating agents NaOH, Na2S and CaS were compared in terms of the average specific filtration resistance and compressibility coefficient of the metal sulfide slurry. Based on the lowest filtration parameters of the produced metal sulfides, it was concluded that CaS was the most effective precipitating agent for the separation and recovery of Cu, Zn and Ni.
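
As a small worked example of the molar ratios reported above, the sketch below converts a metal concentration into the corresponding CaS dose; the molar masses are standard values, and the ratios follow the abstract (equimolar for Zn and Ni, about 2 for Cu).

```python
# Approximate molar masses in g/mol (standard values).
MOLAR_MASS = {"Cu": 63.55, "Zn": 65.38, "Ni": 58.69, "CaS": 72.14}

def cas_dose_mg_per_dm3(metal, metal_conc_mg_dm3, cas_to_metal_ratio):
    """CaS dose (mg/dm3) needed for a given CaS-to-metal molar ratio."""
    metal_mmol = metal_conc_mg_dm3 / MOLAR_MASS[metal]      # mmol/dm3
    return metal_mmol * cas_to_metal_ratio * MOLAR_MASS["CaS"]

# 100 mg/dm3 Cu at the CaS:Cu ratio of about 2 reported above,
# and 100 mg/dm3 Zn at the equimolar ratio.
print(cas_dose_mg_per_dm3("Cu", 100.0, 2.0))   # roughly 227 mg/dm3
print(cas_dose_mg_per_dm3("Zn", 100.0, 1.0))   # roughly 110 mg/dm3
```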

Multi-threshold Approach for License Plate Recognition System

The objective of this paper is to propose an adaptive multi-threshold for image segmentation, specifically for object detection. Because different types of license plates are in use, the requirements of an automatic license plate recognition (LPR) system differ for each country. The proposed technique is applied to a Malaysian LPR application. It is based on a multilayer perceptron trained by backpropagation. The proposed adaptive threshold is introduced to find the optimum threshold values. The technique relies on the peak value of the graph of the number of objects versus a specific range of threshold values. The proposed approach improves the overall performance compared with current optimal threshold techniques. Further improvement of this method to accommodate real-time system specifications is in progress.
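
A minimal sketch of the peak-picking idea described above: count the detected objects over a range of thresholds and select the threshold at the peak of that curve. The connected-component labelling and the synthetic image are illustrative stand-ins, not the paper's MLP-based pipeline.

```python
import numpy as np
from scipy import ndimage

def peak_threshold(gray, thresholds):
    """Pick the threshold at which the number of detected objects peaks.

    gray: 2-D array of grey levels, thresholds: candidate threshold values.
    """
    counts = []
    for t in thresholds:
        _, num_objects = ndimage.label(gray > t)   # connected components above t
        counts.append(num_objects)
    best = int(np.argmax(counts))
    return thresholds[best], counts

# Toy example: two bright "character" blobs on a darker, noisy background.
rng = np.random.default_rng(0)
image = rng.integers(0, 30, size=(60, 200)).astype(float)
image[20:40, 10:30] = 200
image[20:40, 50:70] = 210
t_opt, counts = peak_threshold(image, thresholds=list(range(40, 250, 10)))
print("selected threshold:", t_opt)
```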

Robust Face Recognition using AAM and Gabor Features

In this paper, we propose a face recognition algorithm using AAM and Gabor features. Gabor feature vectors, which are well known to be robust to small variations of shape, scaling, rotation, distortion, illumination and pose in images, are popularly employed as feature vectors in many object detection and recognition algorithms. EBGM, which is prominent among face recognition algorithms employing Gabor feature vectors, requires localization of the facial feature points at which the Gabor feature vectors are extracted. However, the localization method employed in EBGM is based on Gabor jet similarity and is sensitive to initial values, and wrong localization of facial feature points degrades the face recognition rate. AAM is known to be successfully applicable to the localization of facial feature points. In this paper, we devise a facial feature point localization method that first roughly estimates the facial feature points using AAM and then refines them using Gabor jet similarity-based localization initialized with the rough AAM estimates, and we propose a face recognition algorithm that combines this localization method with Gabor feature vectors. Experiments show that such a cascaded localization method based on both AAM and Gabor jet similarity is more robust than localization based on Gabor jet similarity alone. It is also shown that the proposed face recognition algorithm, using the devised localization method and Gabor feature vectors, performs better than conventional face recognition algorithms such as EBGM that use Gabor jet similarity-based localization and Gabor feature vectors.
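
The cascaded localization can be sketched as follows: an AAM provides a rough landmark, which is then refined by maximizing Gabor jet similarity in a small search window. The normalized-dot-product similarity and the `get_jet` callback are assumptions standing in for the paper's actual jet extraction.

```python
import numpy as np

def jet_similarity(jet_a, jet_b):
    """Normalised similarity between two Gabor jets (vectors of filter magnitudes)."""
    jet_a, jet_b = np.asarray(jet_a, float), np.asarray(jet_b, float)
    return float(jet_a @ jet_b / (np.linalg.norm(jet_a) * np.linalg.norm(jet_b) + 1e-12))

def refine_point(get_jet, model_jet, start, search_radius=3):
    """Refine a rough AAM landmark by maximising jet similarity in a local window.

    get_jet(point) is a placeholder callback returning the Gabor jet at a pixel.
    """
    (x0, y0), best_point, best_sim = start, start, -np.inf
    for dx in range(-search_radius, search_radius + 1):
        for dy in range(-search_radius, search_radius + 1):
            candidate = (x0 + dx, y0 + dy)
            sim = jet_similarity(get_jet(candidate), model_jet)
            if sim > best_sim:
                best_point, best_sim = candidate, sim
    return best_point, best_sim
```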

A Systematic Construction of Instability Bounds in LIS Networks

In this work, we study the impact of dynamically changing link slowdowns on the stability properties of packet-switched networks under the Adversarial Queueing Theory framework. In particular, we consider the Adversarial, Quasi-Static Slowdown Queueing Theory model, where each link slowdown may take values in the two-valued set of integers {1, D} with D > 1, which remain fixed for a long time, under a (w, p)-adversary. In this framework, we present an innovative systematic construction for estimating lower bounds on the adversarial injection rate that, if exceeded, cause instability in networks that use the LIS (Longest-In-System) protocol for contention resolution. In addition, we show that the instability bound of a network that uses the LIS protocol for contention resolution drops at injection rates p > 0 when the network size and the high slowdown D take large values. This is the best instability lower bound known for LIS networks.

Arriving at an Optimum Value of Tolerance Factor for Compressing Medical Images

Medical imaging takes advantage of digital technology in imaging and teleradiology. In teleradiology systems, large amounts of data are acquired, stored and transmitted. A major technology that may help to solve the problems associated with massive data storage and data transfer capacity is data compression and decompression. There are many image compression methods available, classified as lossless and lossy. In a lossy compression method, the decompressed image contains some distortion. Fractal image compression (FIC) is a lossy compression method in which an image is coded as a set of contractive transformations in a complete metric space; this set of contractive transformations is guaranteed to produce an approximation to the original image. In this paper, FIC is achieved by a partitioned iterated function system (PIFS) using quadtree partitioning. PIFS is applied to different image modalities: ultrasound, CT scan, angiogram, X-ray and mammogram. For each modality approximately twenty images are considered, and the average compression ratio and PSNR values are obtained. In this method of fractal encoding, the tolerance factor Tmax is varied from 1 to 10 while the other standard parameters are kept constant. For all image modalities the compression ratio and peak signal-to-noise ratio (PSNR) are computed and studied; the quality of the decompressed image is assessed by the PSNR values. The results show that the compression ratio increases with the tolerance factor and that mammograms have the highest compression ratio. Because of the properties of fractal compression, the image quality is not degraded up to an optimum tolerance factor of Tmax = 8.
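
Two small helpers illustrate the quantities discussed above: the PSNR used to assess decompressed image quality, and a quadtree split test driven by the tolerance factor Tmax. The RMS-error form of the split criterion is an assumption; the paper's exact criterion may differ.

```python
import numpy as np

def psnr(original, decompressed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal size."""
    mse = np.mean((original.astype(float) - decompressed.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def needs_split(range_block, mapped_domain_block, t_max):
    """Quadtree criterion: split a range block if the best domain mapping
    misses it by more than the tolerance factor (RMS error)."""
    rms = np.sqrt(np.mean((range_block - mapped_domain_block) ** 2))
    return rms > t_max
```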

Legal Education as a Forming Factor of Legal Culture in Modern Kazakh Society

Forming a legal culture among citizens is a complicated and lengthy process that influences all spheres of social life. It includes promoting justice, learning rights and duties, introducing juridical norms and knowledge, and developing a system of legal acts and constitutional norms. Currently, the evaluative and emotional influence of attempts to establish a legal culture among the citizens of Kazakhstan is limited by real legal practice. As a result, the values essential to a sound civil society are absent from the consciousness of the Kazakh people, who are therefore unable to develop respect for these values. One of the disadvantages of the modern Kazakh educational system is a tendency to underrate the actual forces shaping the worldview of Kazakh youth. The mass media, which are going through a personnel crisis, cannot provide society with the legal and political information necessary to form the sort of legal culture required for a true civil society.

Prediction of Dissolved Oxygen in Rivers Using a Wang-Mendel Method – Case Study of Au Sable River

The amount of dissolved oxygen in a river has a strong direct effect on aquatic macroinvertebrates and, in turn, an indirect influence on the regional ecosystem. In this paper, dissolved oxygen in rivers is predicted using a simple fuzzy logic modelling technique, the Wang-Mendel method. This model uses only previous records to estimate upcoming values. For this purpose, daily records over a 12-year period and hourly records over a 50-day period from eight stations in the Au Sable watershed in Michigan, United States, are employed. The calculations indicate that for long-period prediction it is better to increase the input interval, whereas for filling in missing data it is advisable to decrease the interval. Increasing the partitioning of the input and output features has little influence on accuracy but makes the model very time consuming; increasing the number of inputs behaves similarly. A large amount of training data does not substantially improve accuracy, so an optimum training length should be selected.
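
A compact sketch of the Wang-Mendel rule-generation step, under its usual formulation: each record votes for the fuzzy regions in which its membership is largest, and conflicting rules are resolved by keeping the one with the highest degree. The triangular, evenly spaced regions and the toy dissolved-oxygen values are illustrative choices, not the paper's settings.

```python
import numpy as np

def tri_memberships(value, centres):
    """Triangular membership degrees of `value` in evenly spaced fuzzy regions."""
    centres = np.asarray(centres, float)
    width = centres[1] - centres[0]
    return np.clip(1.0 - np.abs(value - centres) / width, 0.0, None)

def wang_mendel_rules(X, y, centres_in, centres_out):
    """Generate one rule per sample and resolve conflicts by the highest degree.

    X: (n_samples, n_lags) matrix of previous DO records, y: next DO value.
    Returns {antecedent regions (tuple): (consequent region, rule degree)}.
    """
    rules = {}
    for xi, yi in zip(X, y):
        regions, degree = [], 1.0
        for value in xi:
            mu = tri_memberships(value, centres_in)
            regions.append(int(np.argmax(mu)))
            degree *= float(np.max(mu))
        mu_out = tri_memberships(yi, centres_out)
        out_region = int(np.argmax(mu_out))
        degree *= float(np.max(mu_out))
        key = tuple(regions)
        if key not in rules or degree > rules[key][1]:
            rules[key] = (out_region, degree)
    return rules

# Toy usage: five fuzzy regions over an assumed dissolved-oxygen range of 4-12 mg/L.
centres = np.linspace(4.0, 12.0, 5)
X = np.array([[7.9, 8.1], [8.2, 8.4], [6.0, 5.8]])
y = np.array([8.3, 8.6, 5.5])
print(wang_mendel_rules(X, y, centres, centres))
```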

MC and IC – What Is the Relationship?

MC (Management Control) and IC (Internal Control): what is the relationship? This empirical study of the definitions of MC and IC starts from the wider understanding of the terms Internal Control and Management Control, in which attention is focused not only on the financial aspects but also on the soft aspects of the business, such as culture, behaviour, standards and values. Narrower understandings of Management Control focus mainly on the hard, financial aspects of business operation. The definitions of Management Control and Internal Control are often used interchangeably, and the results of this empirical study reveal that Management Control is part of Internal Control; there is no causal link between the two concepts. Based on the interpretation of the respondents, the term Management Control has moved from a broad term to a more limited one covering the soft aspects of influencing behaviour, performance measurement, incentives and culture. This paper is an exploratory study based on qualitative research and on a qualitative matrix-method analysis of the thematic definitions of the terms Management Control and Internal Control.

Approximate Range-Sum Queries over Data Cubes Using Cosine Transform

In this research, we propose to use the discrete cosine transform to approximate the cumulative distributions of data cube cell values. The cosine transform is known to have a good energy compaction property and can therefore approximate data distribution functions with a small number of coefficients. The derived estimator is accurate and easy to update. We perform experiments to compare its performance with a well-known technique, the (Haar) wavelet. The experimental results show that the cosine transform performs much better than the wavelet in estimation accuracy, speed, space efficiency and ease of updating.
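
A minimal one-dimensional sketch of the idea: keep only the leading DCT coefficients of the cumulative cell values and answer range-sum queries from the reconstructed cumulative function. The coefficient count and the random data are illustrative.

```python
import numpy as np
from scipy.fft import dct, idct

def compress_cumulative(cell_values, keep):
    """Cumulative distribution of 1-D cube cell values, kept as `keep` DCT coefficients."""
    cumulative = np.cumsum(cell_values).astype(float)
    coeffs = dct(cumulative, norm="ortho")
    coeffs[keep:] = 0.0                      # energy compaction: drop the tail
    return coeffs

def range_sum(coeffs, lo, hi):
    """Approximate sum of cells lo..hi from the truncated DCT of the cumulative array."""
    cumulative = idct(coeffs, norm="ortho")
    return cumulative[hi] - (cumulative[lo - 1] if lo > 0 else 0.0)

values = np.random.default_rng(1).integers(0, 100, size=256)
coeffs = compress_cumulative(values, keep=32)
print(range_sum(coeffs, 10, 200), values[10:201].sum())   # estimate vs. exact
```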

Mining Gene Relations in Microarray Data Combined with Ontology in a Colon Cancer Automated Diagnosis System

The MATCH project [1] entails the development of an automated diagnosis system that aims to support the treatment of colon cancer by discovering mutations that occur in tumour suppressor genes (TSGs) and contribute to the development of cancerous tumours. The system is based on a) colon cancer clinical data and b) biological information derived by data mining techniques from genomic and proteomic sources. The core mining module will consist of popular, well-tested hybrid feature extraction methods and new combined algorithms designed especially for the project. Elements of rough sets, evolutionary computing, cluster analysis, self-organizing maps and association rules will be used to discover the relations between genes and their influence on tumours [2]-[11]. The methods used to process the data have to address its high complexity and potential inconsistency and cope with missing values. They must integrate all the information useful for answering the expert's question. For this purpose, the system has to learn from data, or allow a domain specialist to interactively specify, the part of the knowledge structure it needs to answer a given query. The program should also take into account the importance or rank of the particular parts of the data it analyses and adjust the algorithms used accordingly.
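
As one concrete example of the association-rule element mentioned above, the sketch below computes support and confidence for pairwise rules over a boolean gene-mutation matrix; the data, gene names and thresholds are illustrative and unrelated to the MATCH data sets or algorithms.

```python
import numpy as np

def pairwise_rules(mutations, gene_names, min_support=0.2, min_confidence=0.7):
    """Pairwise association rules 'mutation in A => mutation in B' from a
    boolean samples x genes matrix (illustrative, not the MATCH algorithms)."""
    n_samples, n_genes = mutations.shape
    rules = []
    for a in range(n_genes):
        if mutations[:, a].sum() == 0:
            continue
        for b in range(n_genes):
            if a == b:
                continue
            support = np.mean(mutations[:, a] & mutations[:, b])
            confidence = support * n_samples / mutations[:, a].sum()
            if support >= min_support and confidence >= min_confidence:
                rules.append((gene_names[a], gene_names[b], support, confidence))
    return rules

# Toy mutation matrix: 4 samples x 3 genes (gene names only for illustration).
data = np.array([[1, 1, 0], [1, 1, 1], [0, 1, 0], [1, 0, 1]], dtype=bool)
print(pairwise_rules(data, ["TP53", "APC", "SMAD4"]))
```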

Perturbed-Chain Statistical Associating Fluid Theory (PC-SAFT) Parameters for Propane, Ethylene, and Hydrogen under Supercritical Conditions

The Perturbed-Chain Statistical Associating Fluid Theory (PC-SAFT) equation of state (EOS) is a modified SAFT EOS with three pure-component-specific parameters: segment number (m), diameter (σ) and energy (ε). These PC-SAFT parameters need to be determined for each component under the conditions of interest by fitting experimental data such as vapor pressure, density or heat capacity. PC-SAFT parameters for propane, ethylene and hydrogen in the supercritical region were successfully estimated by fitting experimental density data available in the literature. The regressed PC-SAFT parameters were compared with literature values by estimating pure-component density and calculating the average absolute deviation between the estimated and experimental density values. The PC-SAFT parameters available in the literature, especially for ethylene and hydrogen, estimated density in the supercritical region reasonably well. However, the regressed PC-SAFT parameters performed better in the supercritical region than the PC-SAFT parameters from the literature.
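
The comparison metric described above can be written down directly; the sketch below shows the average absolute deviation and a generic fitting objective, where `density_model` is a placeholder for a PC-SAFT density routine that is not provided here.

```python
import numpy as np

def average_absolute_deviation(rho_exp, rho_calc):
    """Average absolute deviation (%) between experimental and calculated densities."""
    rho_exp, rho_calc = np.asarray(rho_exp, float), np.asarray(rho_calc, float)
    return 100.0 * np.mean(np.abs(rho_calc - rho_exp) / rho_exp)

def fit_objective(params, conditions, rho_exp, density_model):
    """Sum of squared relative density errors for a candidate (m, sigma, epsilon).

    density_model(m, sigma, eps, T, P) stands in for a PC-SAFT density routine.
    """
    m, sigma, eps = params
    rho_exp = np.asarray(rho_exp, float)
    rho_calc = np.array([density_model(m, sigma, eps, T, P) for T, P in conditions])
    return float(np.sum(((rho_calc - rho_exp) / rho_exp) ** 2))
```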

AC Signals Estimation from Irregular Samples

The paper deals with the estimation of the amplitude and phase of an analogue multi-harmonic band-limited signal from irregularly spaced sampling values. To this end, assuming the signal fundamental frequency is known in advance (i.e., estimated at an independent stage), a complexity-reduced algorithm for signal reconstruction in the time domain is proposed. The reduction in complexity is achieved owing to completely new analytical and summarized expressions that enable a quick estimation at a low numerical error. The proposed algorithm for calculating the unknown parameters requires O((2M+1)^2) flops, whereas the straightforward solution of the obtained equations takes O((2M+1)^3) flops (M is the number of harmonic components). The algorithm is applicable to signal reconstruction, spectral estimation and system identification, as well as to other important signal processing problems. The proposed processing method can be used for precise RMS measurements (for power and energy) of a periodic signal based on the presented signal reconstruction. The paper investigates the errors related to the signal parameter estimation, and a computer simulation demonstrates the accuracy of the algorithm.
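
A straightforward least-squares baseline for the estimation problem described above: with the fundamental angular frequency known, the 2M+1 Fourier coefficients are fitted to the irregular samples and converted to per-harmonic amplitude and phase. This sketch uses a general least-squares solver and therefore follows the straightforward O((2M+1)^3) path, not the paper's complexity-reduced algorithm.

```python
import numpy as np

def fit_harmonics(t, s, omega0, M):
    """Least-squares estimate of a multi-harmonic signal from irregular samples.

    t: sample instants, s: sampled values, omega0: known fundamental (rad/s),
    M: number of harmonics. Returns (dc, amplitudes, phases).
    """
    t, s = np.asarray(t, float), np.asarray(s, float)
    columns = [np.ones_like(t)]
    for k in range(1, M + 1):
        columns += [np.cos(k * omega0 * t), np.sin(k * omega0 * t)]
    A = np.column_stack(columns)                      # N x (2M+1) design matrix
    coeffs, *_ = np.linalg.lstsq(A, s, rcond=None)
    dc, a, b = coeffs[0], coeffs[1::2], coeffs[2::2]
    return dc, np.hypot(a, b), np.arctan2(-b, a)      # amplitude and phase per harmonic

# Toy usage: a 50 Hz signal with a third harmonic, sampled at irregular instants.
rng = np.random.default_rng(2)
t = np.sort(rng.uniform(0.0, 0.1, 200))
signal = 2.0 * np.cos(2 * np.pi * 50 * t + 0.3) + 0.5 * np.cos(2 * np.pi * 150 * t - 1.0)
dc, amp, ph = fit_harmonics(t, signal, omega0=2 * np.pi * 50, M=3)
print(amp.round(3), ph.round(3))
```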

2D Graphical Analysis of Wastewater Influent Capacity Time Series

The extraction of meaningful information from an image can serve as an alternative method for time series analysis. In this paper, we propose a graphical analysis of time series grouped into a table with a colour scale adjusted to the numerical values. The advantages of this method are also discussed. The proposed method is easy to understand and is flexible enough to accommodate standard methods of pattern recognition and verification, especially for noisy environmental data.
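
A minimal sketch of the proposed presentation, assuming hourly influent records folded into a day-by-hour table and rendered with an adjusted colour scale; the synthetic series and the colormap choice are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

def series_to_table(series, row_length):
    """Arrange a 1-D time series into a table with `row_length` values per row."""
    n_rows = len(series) // row_length
    return np.asarray(series[:n_rows * row_length], float).reshape(n_rows, row_length)

# Synthetic hourly influent record (60 days x 24 hours), purely illustrative.
flow = 1000 + 200 * np.sin(np.linspace(0, 20 * np.pi, 24 * 60))
table = series_to_table(flow, row_length=24)           # one row per day
plt.imshow(table, aspect="auto", cmap="viridis")        # adjusted colour scale
plt.xlabel("hour of day"); plt.ylabel("day"); plt.colorbar(label="influent flow")
plt.show()
```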

An Ising-based Model for the Spread of Infection

A zero-field ferromagnetic Ising model is used to simulate the propagation of infection in a population arranged on a square lattice. The rate of infection increases with temperature, and the disease spreads faster among individuals with low J values. This effect, however, diminishes at higher temperatures.
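
A minimal Metropolis sketch of the zero-field ferromagnetic Ising lattice described above, with spin +1 read as "infected"; the lattice size, temperature, coupling J and single-seed initial state are illustrative assumptions.

```python
import numpy as np

def metropolis_sweep(spins, J, T, rng):
    """One Metropolis sweep of a zero-field Ising lattice (spin +1 = infected)."""
    n = spins.shape[0]
    for _ in range(n * n):
        i, j = rng.integers(0, n, size=2)
        neighbours = (spins[(i + 1) % n, j] + spins[(i - 1) % n, j] +
                      spins[i, (j + 1) % n] + spins[i, (j - 1) % n])
        dE = 2.0 * J * spins[i, j] * neighbours        # energy change of flipping spin (i, j)
        if dE <= 0 or rng.random() < np.exp(-dE / T):
            spins[i, j] *= -1
    return spins

rng = np.random.default_rng(3)
spins = -np.ones((50, 50), int)    # everyone healthy ...
spins[25, 25] = 1                  # ... except one infected individual
for sweep in range(100):
    metropolis_sweep(spins, J=1.0, T=2.5, rng=rng)
print("infected fraction:", (spins == 1).mean())
```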

An Efficient Architecture for Interleaved Modular Multiplication

Modular multiplication is the basic operation in most public key cryptosystems, such as RSA, DSA, ECC and DH key exchange. Unfortunately, very large operands (on the order of 1024 or 2048 bits) must be used to provide sufficient security strength. The use of such big numbers dramatically slows down the whole cipher system, especially when it runs on embedded processors. So far, customized hardware accelerators developed on FPGAs or ASICs have been the best choice for accelerating modular multiplication in embedded environments. At the same time, many algorithms have been developed to speed up such operations, for example the Montgomery modular multiplication and the interleaved modular multiplication algorithms. Combining customized hardware with an efficient algorithm is expected to provide a much faster cipher system. This paper introduces an enhanced architecture for computing the modular multiplication of two large numbers X and Y modulo a given modulus M. The proposed design is compared with three previous architectures based on carry-save adders and look-up tables; the look-up tables must be loaded with a set of pre-computed values. Our proposed architecture uses the same carry-save addition but replaces both the look-up tables and the pre-computations with an enhanced version of sign detection techniques. The proposed architecture supports higher frequencies than the other architectures and also has a better overall absolute time for a single operation.
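
For reference, a software sketch of the interleaved modular multiplication algorithm mentioned above (shift, conditional add, and at most two subtractions of the modulus per bit); the hardware described in the paper replaces the comparisons with carry-save adders and sign detection, which is not modelled here.

```python
def interleaved_mod_mul(x, y, m, bits):
    """Interleaved modular multiplication: (x * y) mod m, scanning x from MSB to LSB.

    After each shift-and-add the partial product stays below 3*m, so at most
    two subtractions of m restore the invariant p < m.
    """
    p = 0
    for i in reversed(range(bits)):
        p = (p << 1) + (((x >> i) & 1) * y)
        if p >= m:
            p -= m
        if p >= m:
            p -= m
    return p

# 16-bit toy check against Python's built-in modular arithmetic.
x, y, m = 40503, 61999, 65521
assert interleaved_mod_mul(x, y, m, bits=16) == (x * y) % m
```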

A Text Clustering System based on k-means Type Subspace Clustering and Ontology

This paper presents a text clustering system based on a k-means type subspace clustering algorithm for clustering large, high dimensional and sparse text data. In this algorithm, a new step is added to the k-means clustering process to automatically calculate the weights of keywords in each cluster, so that the important words of a cluster can be identified by their weight values. To aid understanding and interpretation of the clustering results, a few keywords that best represent the semantic topic are extracted from each cluster. Two methods are used to extract these representative words. The candidate words are first selected according to the weights calculated by our new algorithm. The candidates are then fed to WordNet to identify the noun words and to consolidate synonyms and hyponyms. Experimental results show that the clustering algorithm is superior to other subspace clustering algorithms, such as PROCLUS and HARP, and to k-means type algorithms such as Bisecting-KMeans. Furthermore, the word extraction method is effective in selecting words that represent the topics of the clusters.
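
The keyword-weighting step can be illustrated as follows: after assignment and centroid update, each cluster receives per-term weights that grow as the within-cluster dispersion of a term shrinks, and the top-weighted terms become candidate keywords. The entropy-style weight update and the parameter gamma are assumptions; the paper's exact update rule is not reproduced here.

```python
import numpy as np

def update_keyword_weights(X, labels, centroids, gamma=0.5):
    """Per-cluster keyword weights from within-cluster term dispersion.

    X: documents x terms matrix, labels: cluster index per document,
    centroids: clusters x terms matrix. Smaller dispersion -> larger weight.
    """
    k, n_terms = centroids.shape
    weights = np.full((k, n_terms), 1.0 / n_terms)
    for c in range(k):
        members = X[labels == c]
        if len(members) == 0:
            continue
        dispersion = np.sum((members - centroids[c]) ** 2, axis=0)
        w = np.exp(-dispersion / gamma)
        weights[c] = w / w.sum()
    return weights

def top_keywords(weights, vocabulary, cluster, n=5):
    """The n highest-weighted candidate keywords of one cluster."""
    order = np.argsort(weights[cluster])[::-1][:n]
    return [vocabulary[i] for i in order]
```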

Optimal Current Control of Externally Excited Synchronous Machines in Automotive Traction Drive Applications

The excellent suitability of the externally excited synchronous machine (EESM) for automotive traction drive applications stems from its high efficiency over the whole operating range and the high availability of its materials. Usually, maximum efficiency is obtained by modelling each individual loss and minimizing the sum of all losses. As a result, the quality of the optimization depends heavily on the precision of the model; moreover, it requires accurate knowledge of the saturation-dependent machine inductances. Therefore, the present contribution proposes a method to minimize the overall losses of a salient-pole EESM and its inverter in steady-state operation based on measurement data only. Since this method does not require any manufacturer data, it is well suited for automated measurement data evaluation and inverter parametrization. The field-oriented control (FOC) of an EESM provides three current components, i.e., three degrees of freedom (DOF). An analytic minimization of the copper losses in the stator and the rotor (assuming constant inductances) is performed and serves as a first approximation of how to choose the optimal current reference values. After a numerical offline minimization of the overall losses based on measurement data, the results are compared with a control strategy that satisfies cos(ϕ) = 1.
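
The first approximation described above (copper-loss minimization at constant inductances) can be sketched as a small constrained optimization; all machine constants, the dq-frame loss and torque expressions, and the torque reference below are illustrative assumptions, not values or formulas taken from the paper.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative machine constants (not from the paper): stator and field
# resistances, d/q inductances, field-to-d-axis mutual inductance, pole pairs.
R_S, R_F = 0.01, 1.2
L_D, L_Q, L_MD = 0.6e-3, 0.3e-3, 2.0e-3
POLE_PAIRS = 4

def copper_losses(currents):
    i_d, i_q, i_f = currents
    return 1.5 * R_S * (i_d**2 + i_q**2) + R_F * i_f**2

def torque(currents):
    i_d, i_q, i_f = currents
    return 1.5 * POLE_PAIRS * (L_MD * i_f * i_q + (L_D - L_Q) * i_d * i_q)

def optimal_currents(torque_ref):
    """Copper-loss-optimal (i_d, i_q, i_f) for a torque reference, assuming
    constant (unsaturated) inductances, i.e. the first approximation in the text."""
    constraint = {"type": "eq", "fun": lambda c: torque(c) - torque_ref}
    result = minimize(copper_losses, x0=[0.0, 50.0, 5.0], constraints=[constraint])
    return result.x

print(optimal_currents(100.0))
```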