Efficient Implementation of Serial and Parallel Support Vector Machine Training with a Multi-Parameter Kernel for Large-Scale Data Mining

This work deals with aspects of support vector learning for large-scale data mining tasks. Based on a decomposition algorithm that can be run in serial and parallel mode, we introduce a data transformation that allows the use of an expensive generalized kernel without additional cost. To speed up the decomposition algorithm, we analyze the problem of working set selection for large data sets and study the influence of the working set size on the scalability of the parallel decomposition scheme. Our modifications and settings improve support vector learning performance and thus allow the use of extensive parameter search methods to optimize classification accuracy.

Cluster Algorithm for Genetic Diversity

As hardware technology advances, the cost of storing data is decreasing. Thus there is an urgent need for new techniques and tools that can intelligently and automatically assist us in transforming this data into useful knowledge. Different data mining techniques have been developed that help in handling these large databases [7]. Data mining is also finding a role in the field of biotechnology. Pedigree means the associated ancestry of a crop variety. Genetic diversity is the variation in the genetic composition of individuals within or among species. Genetic diversity depends upon the pedigree information of the varieties. Parents at lower hierarchic levels carry more weight for predicting genetic diversity than those at upper hierarchic levels; the weight decreases as the level increases. For crossbreeding, the two varieties should be as genetically diverse as possible so as to incorporate the useful characters of both in the newly developed variety. This paper discusses the searching and analysis of different possible pairs of varieties, selected on the basis of morphological characters, climatic conditions and nutrients, so as to obtain the optimal pair that can produce the required crossbreed variety. An algorithm was developed to determine the genetic diversity between the selected wheat varieties. Cluster analysis is used for retrieving the results.

Improvement of Gas Turbine Performance Test in Combine Cycle

One of the important applications of gas turbines is their use with a heat recovery steam generator in combined-cycle technology. Exhaust flow and energy are two key parameters for determining heat recovery steam generator performance, and they are mainly determined by the performance data of the main gas turbine components. For this reason, a method for determining the exhaust energy was developed in the new edition of ASME PTC22. The results of this investigation show that the standard method has considerable error. Therefore, this paper presents a new method for modifying the performance calculation. The modified method is based on exhaust gas constituent analysis and combustion calculations. The case study presented here uses design data of two kinds of General Electric gas turbines to validate the methodologies. The results show that the modified method is more precise than the ASME PTC22 method: the deviation of the calculated exhaust flow from design data is 1.5-2% with the ASME PTC22 method, whereas with the modified method it is 0.3-0.5%. Depending on the precision of the analyzer instruments, the method can be a suitable alternative for the standard gas turbine performance test. In addition, two procedures are proposed within the modified method, one for known and one for unknown fuel composition. The results show that the difference between the two procedures is below 0.02%. Given the reasonable result of the second procedure (unknown fuel composition), the method can be applied to the performance evaluation of gas turbines, so that measurement cost and data gathering can be reduced.

Evaluation of Geosynthetic Forces in GRSRW under Dynamic Condition

Geosynthetics have proved to be suitable for reinforced soil retaining walls. Given the increasing use of geosynthetic reinforced soil systems in regions that experience frequent earthquakes, studying the dynamic behavior of these structures is necessary. Determining the reinforcement forces is therefore one of the most important points in designing retaining walls, since it helps avoid overly conservative design. This paper investigates the effects of parameters such as wall height, acceleration type, vertical spacing of reinforcement, type of reinforcement and soil type on forces and deformation, through numerical modeling of geosynthetic reinforced soil retaining walls (GRSRW) under dynamic loading with the finite difference method using FLAC. The findings indicate rather positive results with each parameter.

Iris Recognition Based On the Low Order Norms of Gradient Components

The iris pattern is an important biological feature of the human body, and it has become a very active topic in both research and practical applications. In this paper, an algorithm is proposed for iris recognition, and a simple, efficient and fast method is introduced to extract a set of discriminatory features using a first-order gradient operator applied to grayscale images. The gradient-based features are robust, to a certain extent, against variations that may occur in the contrast or brightness of iris image samples; these variations mostly occur due to lighting differences and camera changes. First, the iris region is located; it is then remapped to a rectangular area of 360x60 pixels. A new method is also proposed for detecting eyelash and eyelid points; it relies on statistical analysis of the image to mark eyelash and eyelid points as noise. To capture the local variation of the features, the rectangular iris image is partitioned into N overlapping sub-images (blocks); from each block, a set of average directional gradient density values is calculated and used as a texture feature vector. The gradient operators are applied along the horizontal, vertical and diagonal directions. The low-order norms of the gradient components are used to build the feature vector. A Euclidean distance based classifier is used as the matching metric for determining the degree of similarity between the feature vector extracted from the tested iris image and the template feature vectors stored in the database. Experimental tests were performed using 2639 iris images from the CASIA V4-Interval database; the attained recognition accuracy reached up to 99.92%.
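
The block-wise gradient features and Euclidean matching described above could be sketched roughly as follows. This is only an illustration under assumptions: the abstract does not give the block size, the overlap, or which low-order norms are used, so the sketch takes the mean absolute difference (L1-style) and the root mean square difference (L2-style) per direction, and the names `block_features`, `feature_vector` and `match` are hypothetical.

```python
import math

# Pixel offsets (row, col) for the horizontal, vertical and two diagonal directions.
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]

def block_features(img, top, left, height, width):
    """Average L1- and L2-style gradient norms of one block, per direction."""
    feats = []
    for dr, dc in DIRECTIONS:
        diffs = []
        for r in range(top, top + height):
            for c in range(left, left + width):
                r2, c2 = r + dr, c + dc
                # Only use neighbour pairs that lie fully inside the block.
                if top <= r2 < top + height and left <= c2 < left + width:
                    diffs.append(img[r2][c2] - img[r][c])
        n = len(diffs)
        feats.append(sum(abs(d) for d in diffs) / n)            # assumed L1-style average
        feats.append(math.sqrt(sum(d * d for d in diffs) / n))  # assumed L2-style average
    return feats

def feature_vector(img, block_h, block_w, step_r, step_c):
    """Concatenate block features; a step smaller than the block size gives overlap."""
    rows, cols = len(img), len(img[0])
    vec = []
    for top in range(0, rows - block_h + 1, step_r):
        for left in range(0, cols - block_w + 1, step_c):
            vec.extend(block_features(img, top, left, block_h, block_w))
    return vec

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match(probe, templates):
    """Return the label of the stored template closest to the probe vector."""
    return min(templates, key=lambda label: euclidean(probe, templates[label]))
```

In use, `img` would be the normalized iris rectangle as a list of pixel rows, and `templates` a dictionary from identity label to stored feature vector. Masking of eyelash and eyelid pixels is left out of the sketch.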

Establishing of Education Strategy in New Technological Environments with using Student Feedback

Given the new developments in the field of information and communication technologies, the need arises for active use of these new technologies in education. It is clear that the integration of technology into the education system will differ between primary and higher education, and between traditional and distance education. In this study, the integration of technology into distance education is discussed from the viewpoint of students. Using student feedback about an education program in which new technological media are used, the paper explains how survey variables can be separated into positive, negative and supporting factors, and how the education strategy of higher education institutions can be redesigned by examining the variables of each determined factor. The paper concludes with recommendations on the necessity of working as a group of experts from different areas and of using numerical methods in order to establish a successful education strategy.

The Minimum PAPR Code for OFDM Systems

In this paper, a block code to minimize the peak-to-average power ratio (PAPR) of orthogonal frequency division multiplexing (OFDM) signals is proposed. It is shown that cyclic shift and codeword inversion cause no change to the peak envelope power. The encoding rule for the proposed code consists of searching for a seed codeword, shifting the register elements, and determining codeword inversion, which eliminates the look-up table for one-to-one correspondence between the source and the coded data. Simulation results show that OFDM systems with the proposed code always have the minimum PAPR.
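
The invariance of peak envelope power under cyclic shift and inversion can be checked numerically with a small sketch. It rests on assumptions not stated in the abstract: bits are mapped to BPSK symbols, the envelope is sampled at the Nyquist rate (an N-point inverse DFT, with no oversampling), and inversion means complementing every bit. The function names are hypothetical.

```python
import cmath

def idft(symbols):
    """N-point inverse DFT: Nyquist-rate samples of the OFDM baseband signal."""
    n = len(symbols)
    return [
        sum(s * cmath.exp(2j * cmath.pi * k * t / n) for k, s in enumerate(symbols)) / n
        for t in range(n)
    ]

def peak_and_papr(bits):
    """Return (peak envelope power, PAPR) for a binary codeword."""
    symbols = [1.0 if b else -1.0 for b in bits]  # assumed BPSK mapping
    power = [abs(x) ** 2 for x in idft(symbols)]
    peak = max(power)
    return peak, peak / (sum(power) / len(power))

def cyclic_shift(bits, k=1):
    """Rotate the codeword right by k positions."""
    k %= len(bits)
    return bits[-k:] + bits[:-k]

def invert(bits):
    """Complement every bit, which negates every BPSK symbol."""
    return [1 - b for b in bits]

word = [1, 0, 0, 1, 1, 1, 0, 1]
reference_peak, reference_papr = peak_and_papr(word)
# Largest deviation in peak power over all shifts and their inversions.
max_deviation = max(
    abs(peak_and_papr(variant)[0] - reference_peak)
    for k in range(len(word))
    for variant in (cyclic_shift(word, k), invert(cyclic_shift(word, k)))
)
```

Under these assumptions `max_deviation` differs from zero only by floating-point rounding: a cyclic shift of the subcarrier symbols multiplies every time sample by a unit-magnitude phase, and inversion flips the sign of every sample.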

Succesful Companies- Immunization to Global Economic Crisis: Understanding Strategic Role of NGOs

One of the most important secrets of successful companies is that cooperation with NGOs creates a good reputation for them, so that they can be immunized against economic crisis. The performance of the most admired companies in the world, based on the ratings of Forbes and Fortune, shows that most of these firms also have close relationships with NGOs. Today, if a company does something wrong, this information spreads very quickly through society. If people do not like the activities of a company, it can find itself in a public relations nightmare that can threaten its reputation. Since the cost of communication has dropped dramatically due to the widespread use of the internet, the increase in communication among stakeholders via the internet makes companies more visible. These multiple and interdependent interactions among the network of stakeholders are called network relationships. NGOs play the role of a catalyst among the stakeholders of a firm to enhance awareness. Successful firms are aware that NGOs have a central role in today's business world. Firms are also aware that they can enhance their corporate reputation through cooperation with NGOs. This is illustrated in this paper by examining some of the actions of the most successful companies in terms of their cooperation with NGOs.

Maximum Water Hammer Sensitivity Analysis

Pressure waves and water hammer occur in a pumping system when valves are closed or opened suddenly or when pumps fail suddenly. Determining the maximum water hammer is considered one of the most important technical and economic issues that engineers and designers of pumping stations and conveyance pipelines must address. Hammer Software is a recent application used to simulate water hammer. The present study focuses on determining the significance of each input parameter of the application relative to the maximum water hammer estimated by the software. The study determines the variation in estimated maximum water hammer due to variations in input parameters including water temperature, pipe type, thickness and diameter, electromotor rpm and power, and moment of inertia of electromotor and pump. In our study, Kuhrang Pumping Station was modeled using WaterGEMS Software. The pumping station is characterized by a total discharge of 200 liters per second, a dynamic height of 194 meters and 1.5 kilometers of steel conveyance pipeline, and it transports water to Cheshme Morvarid for farmland irrigation. The model was run in steady hydraulic condition and transferred to Hammer Software. The model was then run in several unsteady hydraulic conditions, and the sensitivity of maximum water hammer to each input parameter was calculated. The results show that the parameters to which maximum water hammer is most sensitive are, in decreasing order, the moment of inertia of pump and electromotor, the diameter, type and thickness of the pipe, and the water temperature.

Application of Kansei Engineering and Association Rules Mining in Product Design

Kansei engineering is a technology that converts human feelings into quantitative terms and helps designers develop new products that meet customers' expectations. The standard Kansei engineering procedure involves finding relationships between human feelings and design elements, for which many researchers have established forward and backward relationships through various soft computing techniques. In this paper, we propose a Kansei engineering framework that links human feelings not only to individual design elements but also to the product as a whole, by constructing association rules. In our experiment, the input is the emotion score that subjects give when they see the whole product, obtained by applying semantic differentials. Association rules are then constructed to discover the combinations of design elements that affect human feelings. The results of our experiment suggest patterns of relationships among design elements according to human feelings, which can be derived from the product as a whole.
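
As an illustration of how association rules between design elements and feelings might be scored, the sketch below computes the standard support and confidence measures over a handful of ratings. The data, the element and feeling names, the thresholds and the helper `mine_rules` are all made up for the example; the abstract does not describe the actual rule-mining procedure or data.

```python
from itertools import combinations

def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) over the transactions."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base

def mine_rules(transactions, elements, feelings, min_support, min_confidence, max_size=2):
    """List (elements, feeling, support, confidence) for rules passing both thresholds."""
    rules = []
    for size in range(1, max_size + 1):
        for combo in combinations(sorted(elements), size):
            for feeling in sorted(feelings):
                s = support(transactions, set(combo) | {feeling})
                c = confidence(transactions, combo, {feeling})
                if s >= min_support and c >= min_confidence:
                    rules.append((combo, feeling, s, c))
    return rules

# Made-up ratings: each set holds the design elements of one product variant
# plus the dominant feeling a subject reported for it.
ratings = [
    {"round", "matte", "modern"},
    {"round", "glossy", "modern"},
    {"square", "matte", "classic"},
    {"round", "matte", "modern"},
]
rules = mine_rules(
    ratings,
    elements={"round", "square", "matte", "glossy"},
    feelings={"modern", "classic"},
    min_support=0.5,
    min_confidence=0.9,
)
```

With this toy data, the rules that survive both thresholds link the element "round", alone and combined with "matte", to the feeling "modern". In practice the semantic differential scores would first have to be discretized into such feeling labels.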

Definition in Law: Transgender Identities and Marriage

This paper looks at transgender identities and the law in the context of marriage. It particularly focuses on the role of language and definition in classifying transgendered individuals into a legal category. Two lines of cases in transgender jurisprudence are examined. The former cases decided the definition of 'man' and 'woman' on the basis of biological criteria, while the latter cases held that biological factors should not be the sole criterion for defining a man or a woman. Three categories were found to classify transgender people, namely male, female and "monstrous". Since transgender people challenge the core gender distinction that the law stresses, they are often regarded as problematic and monstrous, which causes them to be subjected to severe legal consequences. This paper discusses these issues by analyzing and comparing different cases in transgender jurisprudence as well as examining how these issues play out in contemporary Hong Kong.

How Social Network Structure Affects the Dynamics of Evolution of Cooperation?

The existence of many biological systems, especially human societies, is based on cooperative behavior [1, 2]. If natural selection favors selfish individuals, then what mechanism explains why we see so many cooperative behaviors? One answer is the effect of network structure. On a graph, cooperators can evolve by forming network clusters [2, 3, 4]. Ohtsuki et al. used the idea of the iterated prisoner's dilemma on a graph to model an evolutionary game. They showed that the average number of neighbors plays an important role in determining whether cooperation is the evolutionarily stable strategy (ESS) of the system [3]. In this paper, we study the dynamics of the evolution of cooperation in a social network. We show that during evolution, the ratio of cooperators among individuals with fewer neighbors to cooperators among other individuals is greater than unity. The extent to which the fitness function depends on the payoff of the game determines this ratio.

Unsupervised Outlier Detection in Streaming Data Using Weighted Clustering

Outlier detection in streaming data is very challenging because streaming data cannot be scanned multiple times and new concepts may keep evolving. Irrelevant attributes can be regarded as noisy attributes, and such attributes further magnify the challenge of working with data streams. In this paper, we propose an unsupervised outlier detection scheme for streaming data. The scheme is based on clustering, since clustering is an unsupervised data mining task that does not require labeled data. Both density-based and partitioning clustering are combined for outlier detection. In this scheme, partitioning clustering is also used to assign weights to attributes according to their relevance, and the weights are adaptive. Weighted attributes help to reduce or remove the effect of noisy attributes. In view of the challenges of streaming data, the proposed scheme is incremental and adaptive to concept evolution. Experimental results on synthetic and real-world data sets show that the proposed approach outperforms an existing approach (CORM) in terms of outlier detection rate and false alarm rate, and for increasing percentages of outliers.
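
The role of attribute weights can be illustrated with a small sketch. It is not the scheme of the paper, whose weighting rule is not given in the abstract: here, as an assumption, each attribute is weighted by the inverse of its mean squared deviation from a cluster centroid, and a point is flagged when its weighted Euclidean distance to the centroid exceeds a hypothetical threshold.

```python
import math

def attribute_weights(points, centroid, eps=1e-9):
    """Weight attributes by inverse within-cluster spread, normalised to sum to 1."""
    dims = len(centroid)
    spread = [
        sum((p[d] - centroid[d]) ** 2 for p in points) / len(points)
        for d in range(dims)
    ]
    inverse = [1.0 / (s + eps) for s in spread]  # eps avoids division by zero
    total = sum(inverse)
    return [v / total for v in inverse]

def weighted_distance(x, centroid, weights):
    """Euclidean distance in which each squared difference is scaled by its weight."""
    return math.sqrt(sum(w * (a - c) ** 2 for w, a, c in zip(weights, x, centroid)))

def is_outlier(x, centroid, weights, threshold):
    return weighted_distance(x, centroid, weights) > threshold

# Toy cluster: the first attribute is tight (relevant), the second is noisy.
cluster = [(1.0, 0.0), (1.1, 10.0), (0.9, -10.0)]
centre = (1.0, 0.0)
weights = attribute_weights(cluster, centre)
```

With these toy values the noisy second attribute receives a very small weight, so a point such as (1.0, 8.0) stays close to the centroid while (3.0, 0.0), which deviates in the relevant attribute, is flagged at a threshold of 1.0. An incremental version would update the centroid and the spreads as new stream points arrive.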

A Computer Model of Language Acquisition – Syllable Learning – Based on Hebbian Cell Assemblies and Reinforcement Learning

Investigating language acquisition is one of the most challenging problems in the study of language. Syllable learning, as a level of language acquisition, is of considerable significance since it plays an important role in language acquisition. Because it is impossible to study language acquisition directly in children, especially in its developmental phases, computer models are useful for examining it. In this paper, a computer model of early language learning for syllable learning is proposed. It is guided by a conceptual model of syllable learning named the Directions Into Velocities of Articulators (DIVA) model. The computer model uses simple associational and reinforcement learning rules, inspired by neuroscience, within a neural network architecture. Our simulation results verify the ability of the proposed computer model to produce phonemes during babbling and early speech. It also provides a framework for examining the neural basis of language learning and communication disorders.

Signed Approach for Mining Web Content Outliers

The emergence of the Internet has brought about a revolution in information storage and retrieval. As most of the data on the web is unstructured and contains a mix of text, video, audio, etc., there is a need to mine information that caters to the specific needs of users without loss of important hidden information. Developing user-friendly and automated tools that provide relevant information quickly has thus become a major challenge in web mining research. Most existing web mining algorithms have concentrated on finding frequent patterns while neglecting the less frequent ones that are likely to contain outlying data such as noise and irrelevant or redundant data. This paper focuses on a Signed approach with full word matching against an organized domain dictionary for mining web content outliers. The Signed approach yields the relevant web documents as well as the outlying web documents. As the dictionary is organized by the number of characters in a word, searching and retrieval of documents take less time and less space.
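
A dictionary organized by word length, combined with full word matching, might look like the sketch below. The scoring rule is an assumption made purely for illustration, since the abstract does not define it: each token found in the domain dictionary counts +1, every other token counts -1, and a document with a negative total is treated as an outlier. All names are hypothetical.

```python
from collections import defaultdict

def build_dictionary(words):
    """Group domain words by length so a lookup only touches one bucket."""
    by_length = defaultdict(set)
    for word in words:
        by_length[len(word)].add(word.lower())
    return by_length

def signed_score(document, by_length):
    """Assumed rule: +1 per token found in the dictionary, -1 per other token."""
    score = 0
    for raw in document.split():
        token = raw.strip(".,;:!?()\"'").lower()
        if not token:
            continue
        # Full word matching: the whole token must equal a dictionary word
        # of the same length.
        score += 1 if token in by_length.get(len(token), ()) else -1
    return score

def is_outlier(document, by_length):
    """Treat a document with a negative signed score as an outlier."""
    return signed_score(document, by_length) < 0

domain = build_dictionary(["cloud", "server", "storage"])
```

For instance, `signed_score("Cloud server storage.", domain)` counts three matches, while a document with no domain words gets a negative score and is reported by `is_outlier`.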

BIDENS: Iterative Density Based Biclustering Algorithm With Application to Gene Expression Analysis

Biclustering is a very useful data mining technique for identifying patterns in which different genes are correlated over a subset of conditions in gene expression analysis. Association rule mining is an efficient approach to biclustering, as in the BIMODULE algorithm, but it is sensitive to the values given to its input parameters and to the discretization procedure used in the preprocessing step. Moreover, when noise is present, classical association rule miners discover multiple small fragments of the true bicluster but miss the true bicluster itself. This paper formally presents a generalized noise-tolerant bicluster model, termed μBicluster. An iterative algorithm based on the proposed model, termed BIDENS, is introduced that can discover a set of k possibly overlapping biclusters simultaneously. Our model uses a more flexible method to partition the dimensions so as to preserve meaningful and significant biclusters. The proposed algorithm can discover biclusters that are hard for BIMODULE to discover. An experimental study on yeast and human gene expression data and several artificial datasets shows that our algorithm offers substantial improvements over several previously proposed biclustering algorithms.

Addressing Data Security in the Cloud

The development of information and communication technology, the increased use of the internet, and the effects of the recession in recent years have led to the increased use of cloud computing based solutions, also called on-demand solutions. These solutions offer a large number of benefits to organizations as well as challenges and risks, mainly determined by data visualization in different geographic locations on the internet. As far as the specific risks of the cloud environment are concerned, data security is still considered a major barrier to adopting cloud computing. The present study offers an approach to ensuring the security of cloud data that is oriented towards the whole data life cycle. The final part of the study focuses on the assessment of data security in the cloud, which forms the basis for determining potential losses and the premise for subsequent improvements and continuous learning.

A Genetic Algorithm for Clustering on Image Data

Clustering is the process of subdividing an input data set into a desired number of subgroups so that members of the same subgroup are similar and members of different subgroups have diverse properties. Many heuristic algorithms have been applied to the clustering problem, which is known to be NP-hard. Genetic algorithms have been used in a wide variety of fields to perform clustering; however, the technique normally has a long running time relative to the input set size. This paper proposes an efficient genetic algorithm for clustering on very large data sets, especially image data sets. The genetic algorithm uses the most time-efficient techniques along with preprocessing of the input data set. We test our algorithm on both artificial and real image data sets, both of which are large. The experimental results show that our algorithm outperforms the k-means algorithm in terms of running time as well as clustering quality.

Simulating the Dynamics of Distribution of Hazardous Substances Emitted by Motor Engines in a Residential Quarter

This article is dedicated to the development of mathematical models for determining the dynamics of the concentration of hazardous substances in the turbulent urban atmosphere. The models take into account the time-space variability of the meteorological fields and such properties of the turbulent atmosphere as its vortex nature, nonlinearity, dissipativity and diffusivity. The turbulent airflow velocity is not assumed to be known when developing the model. However, a simplified model assumes that the turbulent and molecular diffusion ratio is a piecewise constant function that changes with vertical distance from the earth's surface. In this way, an important assumption of vertical stratification of urban air, due to atmospheric accumulation of hazardous substances emitted by motor vehicles, is introduced into the mathematical model. The suggested simplified nonlinear mathematical model for determining the sought exhaust concentration at an a priori unknown turbulent flow velocity is reduced, through a non-degenerate transformation, to a model that is subsequently solved analytically.

Recursive Algorithms for Image Segmentation Based on a Discriminant Criterion

In this study, a new criterion is proposed for determining the number of classes into which an image should be segmented. This criterion is based on discriminant analysis and measures the separability among the segmented classes of pixels. Based on the new discriminant criterion, two algorithms are proposed for recursively segmenting the image into the determined number of classes. The proposed methods can automatically and correctly segment objects with various illuminations into separate images for further processing. Experiments on the extraction of text strings from complex document images demonstrate the effectiveness of the proposed methods.
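
To give a feel for a discriminant-analysis separability measure, the sketch below uses the classical Otsu-style ratio of between-class variance to total variance for a two-class split of gray levels. This is a stand-in chosen for illustration, not necessarily the criterion proposed in the paper, and the function name `best_split` is hypothetical.

```python
def best_split(pixels):
    """Return (threshold, separability) for the best two-class split.

    Separability is between-class variance divided by total variance,
    so it lies between 0 and 1. Pixels <= threshold form the first class.
    """
    n = len(pixels)
    mean = sum(pixels) / n
    total_var = sum((p - mean) ** 2 for p in pixels) / n
    if total_var == 0:
        return None, 0.0  # a uniform region cannot be split
    best_t, best_between = None, -1.0
    for t in sorted(set(pixels))[:-1]:  # the largest value would leave class 2 empty
        low = [p for p in pixels if p <= t]
        high = [p for p in pixels if p > t]
        w_low, w_high = len(low) / n, len(high) / n
        m_low, m_high = sum(low) / len(low), sum(high) / len(high)
        between = w_low * w_high * (m_low - m_high) ** 2
        if between > best_between:
            best_t, best_between = t, between
    return best_t, best_between / total_var
```

A recursive scheme in the spirit of the abstract could call such a function on each resulting class and keep splitting until a stopping condition on the separability is reached; the exact stopping rule is not given in the abstract.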