The Robust Clustering with Reduction Dimension

Clustering is the process of identifying homogeneous groups of objects, called clusters, and is an active topic in data mining. Objects in the same group or class have similar characteristics. This paper discusses a robust clustering process for image data with two dimension reduction approaches: two-dimensional principal component analysis (2DPCA) and principal component analysis (PCA). Image data are high-dimensional, and a standard approach to this problem is dimension reduction, which transforms high-dimensional data into a lower-dimensional space with limited loss of information. One of the most common forms of dimensionality reduction is PCA. 2DPCA is often described as a variant of PCA in which the image matrices are treated directly as 2D matrices; they do not need to be transformed into vectors, so the image covariance matrix can be constructed directly from the original image matrices. However, the classical covariance matrix that is decomposed is very sensitive to outlying observations. The objective of this paper is to compare the performance of the robust minimizing vector variance (MVV) method in 2DPCA and in PCA for clustering arbitrary image data when outliers are hidden in the data set. The simulation aspects of robustness and an illustration of image clustering are discussed at the end of the paper.
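
The difference between the two projections can be sketched as follows. This is a minimal illustration, assuming NumPy and images stored in an array of shape (n, h, w). It uses the classical (non-robust) covariance, so the MVV estimator studied in the paper is not reproduced here, and the function names are illustrative only.

```python
import numpy as np

def pca_projection(images, k):
    """Classical PCA: flatten each image to a vector, then project on the top-k eigenvectors."""
    X = images.reshape(len(images), -1)           # shape (n, h*w)
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / len(X)                      # shape (h*w, h*w)
    _, vecs = np.linalg.eigh(cov)                 # eigenvalues in ascending order
    W = vecs[:, ::-1][:, :k]                      # top-k eigenvectors
    return Xc @ W                                 # shape (n, k)

def twodpca_projection(images, k):
    """2DPCA: build the image covariance directly from the 2D matrices."""
    centered = images - images.mean(axis=0)
    # G = (1/n) * sum_i (A_i - mean)^T (A_i - mean), shape (w, w)
    G = np.einsum("nij,nik->jk", centered, centered) / len(images)
    _, vecs = np.linalg.eigh(G)
    W = vecs[:, ::-1][:, :k]                      # shape (w, k)
    return images @ W                             # shape (n, h, k)
```

Note that the matrix decomposed by 2DPCA has size w × w, whereas the PCA covariance has size (h·w) × (h·w); a robust variant would replace the classical covariance in either function with a robust estimate.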

Operational Risk – Scenario Analysis

This paper focuses on operational risk measurement techniques and on economic capital estimation methods. A data sample of operational losses provided by an anonymous Central European bank is analyzed using several approaches. Loss Distribution Approach and scenario analysis method are considered. Custom plausible loss events defined in a particular scenario are merged with the original data sample and their impact on capital estimates and on the financial institution is evaluated. Two main questions are assessed – What is the most appropriate statistical method to measure and model operational loss data distribution? and What is the impact of hypothetical plausible events on the financial institution? The g&h distribution was evaluated to be the most suitable one for operational risk modeling. The method based on the combination of historical loss events modeling and scenario analysis provides reasonable capital estimates and allows for the measurement of the impact of extreme events on banking operations.
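
As an illustration of the Loss Distribution Approach with a g&h severity, the sketch below simulates annual aggregate losses and reads off a high quantile as a capital figure. It assumes NumPy, a Poisson loss frequency and Tukey's g-and-h transformation of a standard normal variable; all parameter values and function names are hypothetical and are not taken from the bank data analyzed in the paper.

```python
import numpy as np

def g_and_h_sample(rng, size, a, b, g, h):
    """Draw from Tukey's g-and-h: X = a + b * ((exp(g*Z) - 1) / g) * exp(h * Z**2 / 2)."""
    z = rng.standard_normal(size)
    core = z if g == 0 else (np.exp(g * z) - 1.0) / g
    # Note: the transformation can produce negative values; real loss
    # modeling would need to handle that (an assumption left open here).
    return a + b * core * np.exp(h * z ** 2 / 2.0)

def simulate_annual_losses(rng, n_years, lam, a, b, g, h):
    """Aggregate loss per simulated year: Poisson count of g-and-h severities."""
    counts = rng.poisson(lam, size=n_years)
    severities = g_and_h_sample(rng, counts.sum(), a, b, g, h)
    year_index = np.repeat(np.arange(n_years), counts)
    return np.bincount(year_index, weights=severities, minlength=n_years)

def capital_estimate(annual_losses, level=0.999):
    """Quantile of the simulated aggregate loss distribution."""
    return np.quantile(annual_losses, level)
```

In such a setup, scenario analysis would amount to adding hypothetical loss events to the data before the severity parameters are estimated and comparing the resulting capital figures.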

Classification of Non Stationary Signals Using Ben Wavelet and Artificial Neural Networks

The automatic classification of non-stationary signals is an important practical goal in several domains. An essential classification task is to assign the incoming signal to a group associated with the kind of physical phenomenon producing it. In this paper, we present a modular system composed of three blocks: 1) representation, 2) dimensionality reduction and 3) classification. The originality of our work lies in the use of a new wavelet called the "Ben wavelet" in the representation stage. For dimensionality reduction, we propose a new algorithm based on random projection and principal component analysis.

The Necessity of Optimized Management on Surface Water Sources of Zayanderood Basin

One of the key factors in the comprehensive development of an area is the provision of water sources and, at the same time, their appropriate management. Population growth and food security for such a population require sustainable development as well as the reform of traditional management in order to increase the benefit obtained from these sources; in this way, the sustainable exploitation of the sources for future generations is taken into account. Achieving this development without considering water development and its feasibility would be very difficult. The Zayanderood basin, with an area of 41,500 square kilometers, contains 7 sub-basins and 20 hydrologic units. In this basin, only a small part of the total precipitation enters the river flows, and the rest is lost to efficient use in various ways. The most important surface stream of the basin is the Zayanderood River, 403 kilometers long, which originates on the eastern slopes of the Zagros Mountains and, after draining the basin, flows into the Gaavkhooni pond. The variety of water sources and uses in the Zayanderood basin, the transfer of water from other basins into it, the conflict between upstream and downstream beneficiaries, the presence of valuable natural ecosystems such as the Gaavkhooni swamp, and finally the drought conditions and water shortage in the area all call for comprehensive management of water sources in this central basin of Iran, since this kind of management treats the development and management of water sources in a balanced way in order to increase economic and social benefits.
This study surveys the network of surface water sources in the upper and lower sections of the basin. In view of the difficulties and deficiencies in the efficient management of water sources in this basin, as well as the problems of water drainage and destructive floods, guidelines appropriate to the regional conditions are presented in order to prevent the diversion of water in the upper sections and to support the development of the regions in the lower sections of the Zayanderood dam.

Load Discontinuity in Shock Response and Its Remedies

It has been shown that a load discontinuity at the end of an impulse will result in an extra impulse and hence an extra amplitude distortion if a step-by-step integration method is employed to compute the shock response. To overcome this difficulty, three remedies are proposed to reduce the extra amplitude distortion. The first remedy is to solve the momentum equation of motion instead of the force equation of motion in the step-by-step solution of the shock response, where an external momentum is used in the solution of the momentum equation of motion. Since the external momentum is the time integral of the external force, the problem of load discontinuity automatically disappears. The second remedy is to perform a single small time step immediately upon termination of the applied impulse, while the other time steps can still be conducted with the time step determined from general considerations. This works because the extra impulse caused by a load discontinuity at the end of an impulse is almost linearly proportional to the step size. Finally, the third remedy is to use the average of the two different load values at the integration point of the load discontinuity as the loading input, instead of one of them. The basic motivation for this remedy is that it introduces no loading input error at the integration point of the load discontinuity. The feasibility of the three remedies is analytically explained and numerically illustrated.
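
The effect of the value chosen at the discontinuity can be illustrated with a small numerical experiment. The sketch below is an assumption-laden simplification, not the paper's integration scheme: it takes a rectangular pulse of amplitude p0 and duration td, assumes the load varies linearly between time steps, and computes the impulse that results from a given load value at the point of discontinuity.

```python
import numpy as np

def discrete_impulse(p0, td, dt, end_value):
    """Impulse of a rectangular pulse as seen with linear load interpolation.

    end_value is the load assigned to the grid point t = td, where the
    true load jumps from p0 to 0 (td is assumed to be a multiple of dt).
    """
    n = int(round(td / dt))
    t = np.arange(n + 2) * dt                  # grid extends one step past td
    p = np.where(t < td - 1e-12, float(p0), 0.0)
    p[n] = end_value                           # value used at the discontinuity
    # trapezoidal rule over each step
    return float(np.sum(0.5 * (p[1:] + p[:-1]) * np.diff(t)))
```

Under these assumptions the exact impulse is p0·td. Using end_value = p0 gives an extra impulse of p0·dt/2, which halves when dt is halved (the idea behind the second remedy), while end_value = p0/2 reproduces the exact impulse (the idea behind the third remedy).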

A Study on Polymer Coated Colour Pigments for Water-Based Ink

Pigments covered by film-forming polymers offer a prospect of improving the quality of water-based printing inks. In this study, such pigments were prepared by the initiated polymerization of styrene and methacrylate derivative monomers in aqueous pigment dispersions. The formation of polymer films covering the pigment cores depends on the polymerization time and the ratio of pigment to monomers. At a polymerization time of 4 hours and a ratio of 1/10, almost all pigment particles are coated by the polymer. The polymer covers formed on the pigments have an average thickness of 5.95 nm. The percentage size increase of the coated particles after a week is 4.5%, about fourteen-fold lower than that of the original ones. The results indicate that the coated pigments have improved dispersion stability in a water medium while retaining their optical colour.

Dimension Reduction of Microarray Data Based on Local Principal Component

Analysis and visualization of microarray data is very helpful for biologists and clinicians in the diagnosis and treatment of patients. It allows clinicians to better understand the structure of microarray data and facilitates the understanding of gene expression in cells. However, a microarray dataset is complex, with thousands of features and a very small number of observations. Such very high-dimensional data often contain noise, non-useful information and only a small number of features relevant to the disease or genotype. This paper proposes a non-linear dimensionality reduction algorithm, Local Principal Component (LPC), which aims to map high-dimensional data to a lower-dimensional space. The reduced data represent the most important variables underlying the original data. Experimental results and comparisons are presented to show the quality of the proposed algorithm. Moreover, the experiments show how this algorithm reduces high-dimensional data whilst preserving in the low-dimensional space the neighbourhoods that the points have in the high-dimensional space.

“FGM is with us Everyday” Women and Girls Speak out about Female Genital Mutilation in the UK

There is inadequate information on the practice of female genital mutilation (FGM) in the UK, and there are often myths and perceptions within communities that influence the effectiveness of prevention programmes. This makes it difficult to address the trends and changes in the practice in the UK. To this end, FORWARD undertook novel and innovative research using the Participatory Ethnographic and Evaluative Research (PEER) method to explore the views of women from Eritrea, Sudan, Somalia and Ethiopia who live in London and Bristol (two UK cities). Women's views, taken from PEER interviews, reflected reasons for the continued practice of FGM: marriageability, the harnessing and control of female sexuality, and upholding traditions from their countries of origin. It was also clear that the main supporters of the practice were believed to be older women within families and communities. Women described the impact FGM was having on their lives as isolating. Although it was clearly considered a private and personal matter, they developed a real sense of connection with their peers during the research process. The women were overwhelmingly positive about combating the practice, although they believed it would probably take a while before it ends completely. They also made concrete recommendations on how to improve support services for women affected by FGM: training for professionals (particularly in healthcare); increased engagement with, and outreach to, communities; culturally appropriate materials and information made available and accessible to communities; and more consistent implementation of legislation. Finally, the women asked for more empathy and understanding, particularly from health professionals. Rather than presenting FGM as a completely alien and inconceivable practice, it may help those looking into these women's lives and working with them to understand the social and economic context in which the practice takes place.

An Investigation into Ozone Concentration at Urban and Rural Monitoring Stations in Malaysia

This study investigated the relationship between urban and rural ozone concentrations and quantified the extent to which ambient rural conditions and the concentrations of other pollutants can be used to predict urban ozone concentrations. The study describes the variations of ozone on weekdays and weekends as well as the daily maximum recorded at selected monitoring stations. The results showed that the Putrajaya station had the highest concentrations of O3 on weekends due to the titration of NO during weekdays. Additionally, Jerantut had the lowest average concentration, with high readings on Wednesdays. The comparison of average and maximum ozone concentrations for the three stations showed that the strongest significant correlation was recorded at the Jerantut station, with R² = 0.769. Ozone concentrations originating from a neighbouring urban site are a better predictor of urban ozone concentrations than widespread rural ozone at some levels of temporal averaging. It is found that in urban and rural areas of Peninsular Malaysia, the concentration of ozone depends on the concentration of NOx and on seasonal meteorological factors. The HYSPLIT model (for the northeast monsoon) showed that the wind direction can also influence the concentration of ozone in the atmosphere in the studied areas.

Fast Wavelet Image Denoising Based on Local Variance and Edge Analysis

The approach based on the wavelet transform has been widely used for image denoising due to its multi-resolution nature, its ability to produce high levels of noise reduction and the low level of distortion it introduces. However, when noise is removed, high-frequency components belonging to edges are also removed, which blurs the signal features. This paper proposes a new method of image noise reduction based on local variance and edge analysis. The analysis is performed by dividing an image into 32 × 32 pixel blocks and transforming the data into the wavelet domain. A fast lifting wavelet spatial-frequency decomposition and reconstruction is developed, with the advantages of being computationally efficient and minimizing boundary effects. Adaptive thresholding by local variance estimation and edge strength measurement can effectively reduce image noise while preserving the features of the original image that correspond to the boundaries of objects. Experimental results demonstrate that the method performs well for images contaminated by natural and artificial noise, and that it can be adapted to different classes of images and types of noise. The proposed algorithm provides a potential solution with parallel computation for real-time or embedded system applications.
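
The combination of a lifting transform with locally adaptive thresholding can be sketched in one dimension as follows. This is not the paper's algorithm: it assumes NumPy, a single-level Haar lifting step on an even-length signal, and a BayesShrink-style threshold (noise variance divided by the local signal standard deviation) swapped in as a stand-in for the local variance and edge analysis. All function names are illustrative.

```python
import numpy as np

def haar_lift_forward(x):
    """One Haar lifting step: predict (detail) then update (pair average)."""
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]
    d = odd - even              # predict step: detail coefficients
    s = even + d / 2.0          # update step: s equals the pair average
    return s, d

def haar_lift_inverse(s, d):
    """Undo the lifting steps in reverse order (perfect reconstruction)."""
    even = s - d / 2.0
    odd = d + even
    x = np.empty(2 * len(s))
    x[0::2], x[1::2] = even, odd
    return x

def soft_threshold(d, thr):
    return np.sign(d) * np.maximum(np.abs(d) - thr, 0.0)

def local_thresholds(d, noise_var, window=5):
    """Smaller threshold where local detail energy is high (likely an edge).

    noise_var is the noise variance in the detail coefficients; for
    d = odd - even with i.i.d. noise of variance sigma**2 per sample it is
    2 * sigma**2. Requires len(d) >= window.
    """
    local_power = np.convolve(d ** 2, np.ones(window) / window, mode="same")
    signal_var = np.maximum(local_power - noise_var, 1e-12)
    return noise_var / np.sqrt(signal_var)

def denoise(x, noise_var, window=5):
    s, d = haar_lift_forward(x)
    d_hat = soft_threshold(d, local_thresholds(d, noise_var, window))
    return haar_lift_inverse(s, d_hat)
```

Because each block is processed independently, a row-by-row or block-by-block version of this sketch would parallelize naturally, which is in line with the parallel computation mentioned above.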

Computer Aided Drug Design and Studies of Antiviral Drug against H3N2 Influenza Virus

The worldwide prevalence of the H3N2 influenza virus and its increasing resistance to existing drugs necessitate the development of an improved, better-targeted anti-influenza drug. H3N2 influenza neuraminidase is one of the two membrane-bound proteins belonging to the group-2 neuraminidases. It is a key player in viral pathogenicity and hence an important target of anti-influenza drugs. Oseltamivir is one of the potent drugs targeting this neuraminidase. In the present work, we have taken subtype N2 neuraminidase as the receptor and probable analogs of oseltamivir as drug molecules to study the protein-drug interaction, in anticipation of finding an efficient modified candidate compound. Oseltamivir analogs were made by modifying the functional groups using the Marvin Sketch software and were docked using Schrödinger's Glide. Oseltamivir analog 10 was found to have a significant energy value (16% less than that of oseltamivir) and could be the probable lead molecule. This suggests that some of the modified compounds can interact in a novel manner, with increased hydrogen bonding at the active site of neuraminidase, and might be better than the original drug. Further work can be carried out, such as enzymatic inhibition studies and the synthesis and crystallization of the drug-target complex, to analyze the interactions biologically.

Empirical Study of Real Retail Trade Turnover

This paper deals with the econometric analysis of real retail trade turnover. It is part of extensive scientific research on modern trends in the Croatian national economy. At the end of its transition period, Croatia is confronted with the challenges and problems of a high-consumption society. In this environment, real retail trade turnover, average monthly real wages and household loans are chosen as the crucial economic variables for the analysis. To allow a complete multiple econometric analysis, the data base had to be adjusted. Namely, it was necessary to deflate the original national statistics on retail trade turnover using consumer price indices, and to seasonally adjust the series. In establishing the model, it was necessary to include procedures for overcoming the autocorrelation and collinearity problems. Moreover, for the case of a time-series shift, a specific appropriate econometric instrument was applied. It should be emphasized that the whole methodological procedure is based on real time series of the Croatian national economy.
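
Two of the steps mentioned above can be written down compactly. The sketch below assumes NumPy, a consumer price index expressed relative to a base of 100, and the Durbin-Watson statistic as one common check for first-order autocorrelation in regression residuals; the paper does not state which diagnostics it used, and the function names are illustrative.

```python
import numpy as np

def deflate(nominal, cpi, base=100.0):
    """Convert a nominal series to real terms: real_t = nominal_t * base / CPI_t."""
    return np.asarray(nominal, dtype=float) * base / np.asarray(cpi, dtype=float)

def durbin_watson(residuals):
    """DW = sum((e_t - e_{t-1})**2) / sum(e_t**2); values near 2 suggest no
    first-order autocorrelation, values near 0 or 4 suggest positive or
    negative autocorrelation respectively."""
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))
```

For example, a nominal turnover of 121 in a month where the index stands at 110 corresponds to a real turnover of 110 in base-period prices.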

Semantic Web as an Enabling Technology for Better e-Services Adoption

E-services have significantly changed the way of doing business in recent years. We can, however, observe poor use of these services: there is a large gap between the supply of e-services and their actual usage. This is why we started a project to provide an environment that will encourage the use of e-services. We believe that merely providing e-services does not automatically mean that consumers will use them. This paper shows the origins of our project and its current position. We discuss the decision to use semantic web technologies and their potential to improve e-services usage. We also present the current knowledge base and its real-world classification. Finally, we discuss further work to be done in the project. The current state of the project is promising.

Intelligent Caching in on-demand Routing Protocol for Mobile Adhoc Networks

An on-demand routing protocol for wireless ad hoc networks is one that searches for and attempts to discover a route to some destination node only when a sending node originates a data packet addressed to that node. To avoid the need for such a route discovery before each data packet is sent, such routing protocols must cache previously discovered routes. This paper presents an analysis of the effect of intelligent caching in a non-clustered network, using on-demand routing protocols in wireless ad hoc networks. The analysis is based on the Dynamic Source Routing protocol (DSR), which operates entirely on demand. DSR uses the cache in every node to save the paths that are learnt during the route discovery procedure. In this implementation, these paths are cached only at intermediate nodes, and the paths from these caches are used when required. This technique helps to store a larger number of learnt routes without erasing cache entries to make room for a newly learnt route. The simulation results on DSR show that this technique drastically increases the memory available for caching the discovered routes without affecting the performance of the DSR routing protocol in any way, except for a small increase in end-to-end delay.
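
The idea of caching only at intermediate nodes can be sketched with a toy data structure. This is a hypothetical illustration, not the DSR specification or the paper's implementation: the class name, the bounded capacity and the rule of keeping existing entries when the cache is full are all assumptions made for the example.

```python
class Node:
    """Toy node with a bounded route cache (illustrative only)."""

    def __init__(self, node_id, capacity=8):
        self.node_id = node_id
        self.capacity = capacity
        self.cache = {}  # destination -> path starting at this node

    def observe_route(self, route):
        """Called when a discovered source route passes through this node.

        Only intermediate nodes (neither the source nor the destination)
        store the remaining path to the destination.
        """
        if self.node_id not in route[1:-1]:
            return False
        suffix = route[route.index(self.node_id):]
        destination = suffix[-1]
        if destination not in self.cache and len(self.cache) >= self.capacity:
            return False  # assumed policy: do not evict existing entries
        self.cache[destination] = suffix
        return True

    def lookup(self, destination):
        """Return a cached path to the destination, or None if unknown."""
        return self.cache.get(destination)
```

With a route S, A, B, D, node A would store the path A, B, D, while the endpoints S and D store nothing; a later request for D that reaches A could then be answered from A's cache instead of triggering a new route discovery.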

Classification of Acoustic Emission Based Partial Discharge in Oil Pressboard Insulation System Using Wavelet Analysis

The insulation used in transformers is mostly oil pressboard insulation. Insulation failure is one of the major causes of catastrophic failure of transformers. It is established that partial discharges (PD) cause insulation degradation and premature failure of insulation. Online monitoring of PDs can reduce the risk of catastrophic failure of transformers. There are different techniques of partial discharge measurement, such as electrical, optical, acoustic, opto-acoustic and ultra high frequency (UHF) techniques. Being non-invasive and not prone to interference, the acoustic emission technique is advantageous for online PD measurement. Acoustic detection of PD is based on the retrieval and analysis of mechanical or pressure signals produced by partial discharges. Partial discharges are classified according to the origin of the discharges, and their effects on insulation deterioration differ between types. This paper reports experimental results and analysis for the classification of partial discharges using acoustic emission signals of laboratory-simulated partial discharges in an oil pressboard insulation system with three different electrode systems. Acoustic emission signals produced by PD are detected by sensors mounted on the surface of the experimental tank, stored on an oscilloscope and fed to a computer for further analysis. The measured AE signals are analyzed using discrete wavelet transform (DWT) analysis and wavelet packet analysis. The energy distribution in the different frequency bands of the DWT-decomposed signal and of the wavelet-packet-decomposed signal is calculated. These analyses show a distinct feature useful for PD classification. Wavelet packet analysis can resolve misclassifications arising from the DWT in most cases.
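
The energy distribution across DWT bands can be computed as in the sketch below. It assumes NumPy and an orthonormal Haar wavelet, chosen only because it is easy to write out by hand; the abstract does not name the mother wavelet used, and the function names are illustrative.

```python
import numpy as np

def haar_dwt_bands(x, levels):
    """Orthonormal Haar DWT; returns [d1, d2, ..., dL, aL].

    d1 holds the highest-frequency details and aL the final approximation.
    len(x) is assumed to be divisible by 2**levels.
    """
    a = np.asarray(x, dtype=float)
    bands = []
    for _ in range(levels):
        even, odd = a[0::2], a[1::2]
        bands.append((even - odd) / np.sqrt(2.0))   # detail band
        a = (even + odd) / np.sqrt(2.0)             # approximation
    bands.append(a)
    return bands

def energy_distribution(bands):
    """Fraction of the total signal energy contained in each band."""
    energies = np.array([np.sum(b ** 2) for b in bands])
    return energies / energies.sum()
```

Since the transform is orthonormal, the band energies add up to the energy of the original signal, so the resulting fractions can serve directly as a feature vector for comparing discharge types.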

Advanced Geolocation of IP Addresses

Tracing and locating the geographical location of users (Geolocation) is used extensively in today's Internet. Whenever we request a page from Google, for example, we are automatically forwarded to the page in the relevant language, unless a specific configuration has been made, and, among other things, specific advertisements are presented depending on our identified location. Geolocation has a significant impact especially in the area of network security. Because of the way the Internet works, attacks can be executed from almost anywhere. Therefore, for attribution, knowledge of the origin of an attack, and thus Geolocation, is mandatory in order to be able to trace back an attacker. In addition, Geolocation can be used very successfully to increase the security of a network during operation (i.e. before an intrusion has actually taken place). Similar to greylisting in email, Geolocation makes it possible (i) to correlate detected attacks with new connections and (ii) as a consequence, to classify traffic a priori as more suspicious (thus allowing this traffic in particular to be inspected in more detail). Although numerous techniques for Geolocation exist, each strategy is subject to certain restrictions. Following the ideas of Endo et al., this publication tries to overcome these shortcomings with a combined solution of different methods that allows improved and optimized Geolocation. Thus, we present our architecture for improved Geolocation, based on a new algorithm that combines several Geolocation techniques to increase the accuracy.

ANN Based Currency Recognition System using Compressed Gray Scale and Application for Sri Lankan Currency Notes - SLCRec

Automatic currency note recognition invariably depends on the currency note characteristics of a particular country, and the extraction of features directly affects the recognition ability. Sri Lanka has not previously been involved in any research or implementation of this kind. The proposed system, "SLCRec", offers a solution that focuses on minimizing the false rejection of notes. Sri Lankan currency notes undergo severe changes in image quality through usage. Hence a special linear transformation function is adapted to remove noise patterns from backgrounds without affecting the notes' characteristic images and to bring out the images of interest. The transformation maps the original gray scale range into a smaller range of 0 to 125. Applying edge detection after the transformation provided better robustness to noise and a fair representation of edges for new notes and old, damaged notes. A three-layer back propagation neural network is presented with the number of edges detected in row order of the notes, and classification is performed into four classes of interest, which are the 100, 500, 1000 and 2000 rupee notes. The experiments showed good classification results and demonstrated that the proposed methodology is capable of separating the classes properly under varying image conditions.

On-line Lao Handwritten Recognition with Proportional Invariant Feature

This paper proposes a high-level feature for online Lao handwriting recognition. The feature must be high-level enough that it does not change when characters are written by different persons at different speeds and in different proportions (shorter or longer stroke, head, tail, loop, curve). With this feature, a character is divided into a sequence of curve segments, where a new segment starts wherever the curve reverses its rotation (between counter-clockwise and clockwise). In each segment, the following features are gathered: the cumulative change in direction of the curve (negative for clockwise), the cumulative curve length, and the cumulative lengths of left-to-right, right-to-left, top-to-bottom and bottom-to-top movement (the cumulative change along the X and Y axes of the segment). The feature is simple yet robust enough for high-accuracy recognition. It can be gathered by parsing the original time-sampled sequence of X, Y points of the pen location, without re-sampling. We also experiment with other segmentation points, such as the maximum curvature point, which has been widely used by other researchers. Experimental results show a recognition rate of 94.62%, compared with 75.07% when the maximum curvature point is used. This is due to the large variation of turning points in handwriting.
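
The segmentation rule and the per-segment features described above can be sketched as follows. This is an interpretation under stated assumptions, not the authors' code: points are (x, y) tuples in a frame where y increases upward (so a positive turning angle is counter-clockwise), straight steps with zero turning angle do not start a new segment, and the dictionary keys are invented for the example.

```python
import math

def turning_angles(points):
    """Signed turning angle at each interior point (positive = counter-clockwise)."""
    angles = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        ax, ay = x1 - x0, y1 - y0
        bx, by = x2 - x1, y2 - y1
        angles.append(math.atan2(ax * by - ay * bx, ax * bx + ay * by))
    return angles

def _empty_segment():
    return {"turn": 0.0, "length": 0.0,
            "dx_pos": 0.0, "dx_neg": 0.0, "dy_pos": 0.0, "dy_neg": 0.0}

def segment_features(points):
    """Split a pen trajectory where the rotation reverses; accumulate features."""
    angles = turning_angles(points)
    segments, seg, sign = [], _empty_segment(), 0
    for i in range(1, len(points)):
        (x0, y0), (x1, y1) = points[i - 1], points[i]
        if i >= 2:
            turn = angles[i - 2]                  # turn made at points[i - 1]
            turn_sign = (turn > 0) - (turn < 0)
            if turn_sign != 0 and sign != 0 and turn_sign != sign:
                segments.append(seg)              # rotation reversed: new segment
                seg = _empty_segment()
            if turn_sign != 0:
                sign = turn_sign
            seg["turn"] += turn                   # cumulative direction change
        dx, dy = x1 - x0, y1 - y0
        seg["length"] += math.hypot(dx, dy)       # cumulative curve length
        seg["dx_pos"] += max(dx, 0.0)             # left-to-right movement
        seg["dx_neg"] += max(-dx, 0.0)            # right-to-left movement
        seg["dy_pos"] += max(dy, 0.0)
        seg["dy_neg"] += max(-dy, 0.0)
    segments.append(seg)
    return segments
```

The sketch works directly on the sampled pen positions, in line with the statement that no re-sampling is needed; because the features are cumulative sums over a segment, they do not depend on how many samples a writer's speed happens to produce.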

The Haar Wavelet Transform of the DNA Signal Representation

Deoxyribonucleic acid (DNA), a double-stranded helix of nucleotides, consists of adenine (A), cytosine (C), guanine (G) and thymine (T). In this work, we convert this genetic code into an equivalent digital signal representation. By applying a wavelet transform, such as the Haar wavelet, we are able to extract details that are not clearly visible in the original genetic code. We compare different organisms using the results of the Haar wavelet transform. This is achieved by using the trend part of the signal, since the trend part carries most of the energy of the digital signal representation. Consequently, we are able to quantitatively reconstruct different biological families.
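
A minimal sketch of this pipeline is given below, assuming NumPy. The numeric mapping of nucleotides (A=1, C=2, G=3, T=4) is a hypothetical choice made for illustration, since the abstract does not specify the signal representation, and only a single level of the orthonormal Haar transform is shown.

```python
import numpy as np

# Hypothetical nucleotide-to-number mapping (not taken from the paper).
MAPPING = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}

def dna_to_signal(sequence):
    """Convert a nucleotide string into a numeric signal."""
    return np.array([MAPPING[base] for base in sequence.upper()])

def haar_step(x):
    """One level of the orthonormal Haar transform: (trend, fluctuation)."""
    x = x[: len(x) // 2 * 2]                       # drop a trailing odd sample
    trend = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    fluctuation = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return trend, fluctuation

def trend_energy_fraction(sequence):
    """Share of the signal energy carried by the trend part."""
    trend, fluctuation = haar_step(dna_to_signal(sequence))
    return float(np.sum(trend ** 2) / (np.sum(trend ** 2) + np.sum(fluctuation ** 2)))
```

With this mapping, the trend of a sequence is a half-length summary that retains most of the energy, so trends of different organisms could be compared with any distance measure on equal-length signals.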

Optimizing of Fuzzy C-Means Clustering Algorithm Using GA

The fuzzy C-means clustering algorithm (FCM) is a method that is frequently used in pattern recognition. It has the advantage of giving good modeling results in many cases, although it is not capable of specifying the number of clusters by itself. In the FCM algorithm, most researchers fix the weighting exponent (m) to a conventional value of 2, which might not be appropriate for all applications. Consequently, the main objective of this paper is to use the subtractive clustering algorithm to provide the optimal number of clusters needed by the FCM algorithm, by optimizing the parameters of the subtractive clustering algorithm with an iterative search approach, and then to find an optimal weighting exponent (m) for the FCM algorithm. To obtain an optimal number of clusters, the iterative search approach is used to find the optimal single-output Sugeno-type Fuzzy Inference System (FIS) model by optimizing the parameters of the subtractive clustering algorithm that give the minimum least square error between the actual data and the Sugeno fuzzy model. Once the number of clusters is optimized, two approaches are proposed to optimize the weighting exponent (m) in the FCM algorithm, namely the iterative search approach and genetic algorithms. The approach is tested on data generated from the original function, and optimal fuzzy models are obtained with minimum error between the real data and the obtained fuzzy models.
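
The role of the weighting exponent m can be seen in the standard FCM update equations, sketched below with NumPy. This is the textbook algorithm with a random initial membership matrix, not the optimization procedure of the paper; the subtractive clustering step, the Sugeno model and the genetic search over m are not reproduced, and the function signature is illustrative.

```python
import numpy as np

def fcm(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means for data X of shape (n, d) and c clusters (m > 1).

    Returns (centers, U), where U[i, k] is the membership of point i in cluster k.
    """
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)             # each row sums to 1
    for _ in range(n_iter):
        Um = U ** m
        # centers: membership-weighted means, weights u_ik ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)            # avoid division by zero
        # u_ik = d_ik^(-2/(m-1)) / sum_j d_ij^(-2/(m-1))
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U
```

Values of m close to 1 make the memberships nearly crisp, while larger values make them fuzzier, which is why a fixed m = 2 need not suit every data set; an iterative search could simply call this function for a range of candidate m values and keep the one with the smallest modeling error.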