Information Retrieval in Domain Specific Search Engine with Machine Learning Approaches

As the web continues to grow exponentially, the idea of crawling the entire web on a regular basis becomes less and less feasible, so the need to include information on specific domain, domain-specific search engines was proposed. As more information becomes available on the World Wide Web, it becomes more difficult to provide effective search tools for information access. Today, people access web information through two main kinds of search interfaces: Browsers (clicking and following hyperlinks) and Query Engines (queries in the form of a set of keywords showing the topic of interest) [2]. Better support is needed for expressing one's information need and returning high quality search results by web search tools. There appears to be a need for systems that do reasoning under uncertainty and are flexible enough to recover from the contradictions, inconsistencies, and irregularities that such reasoning involves. In a multi-view problem, the features of the domain can be partitioned into disjoint subsets (views) that are sufficient to learn the target concept. Semi-supervised, multi-view algorithms, which reduce the amount of labeled data required for learning, rely on the assumptions that the views are compatible and uncorrelated. This paper describes the use of semi-structured machine learning approach with Active learning for the “Domain Specific Search Engines". A domain-specific search engine is “An information access system that allows access to all the information on the web that is relevant to a particular domain. The proposed work shows that with the help of this approach relevant data can be extracted with the minimum queries fired by the user. It requires small number of labeled data and pool of unlabelled data on which the learning algorithm is applied to extract the required data.

DCBOR: A Density Clustering Based on Outlier Removal

Data clustering is an important data exploration technique with many applications in data mining. We present an enhanced version of the well known single link clustering algorithm. We will refer to this algorithm as DCBOR. The proposed algorithm alleviates the chain effect by removing the outliers from the given dataset. So this algorithm provides outlier detection and data clustering simultaneously. This algorithm does not need to update the distance matrix, since the algorithm depends on merging the most k-nearest objects in one step and the cluster continues grow as long as possible under specified condition. So the algorithm consists of two phases; at the first phase, it removes the outliers from the input dataset. At the second phase, it performs the clustering process. This algorithm discovers clusters of different shapes, sizes, densities and requires only one input parameter; this parameter represents a threshold for outlier points. The value of the input parameter is ranging from 0 to 1. The algorithm supports the user in determining an appropriate value for it. We have tested this algorithm on different datasets contain outlier and connecting clusters by chain of density points, and the algorithm discovers the correct clusters. The results of our experiments demonstrate the effectiveness and the efficiency of DCBOR.

Integrating Fast Karnough Map and Modular Neural Networks for Simplification and Realization of Complex Boolean Functions

In this paper a new fast simplification method is presented. Such method realizes Karnough map with large number of variables. In order to accelerate the operation of the proposed method, a new approach for fast detection of group of ones is presented. Such approach implemented in the frequency domain. The search operation relies on performing cross correlation in the frequency domain rather than time one. It is proved mathematically and practically that the number of computation steps required for the presented method is less than that needed by conventional cross correlation. Simulation results using MATLAB confirm the theoretical computations. Furthermore, a powerful solution for realization of complex functions is given. The simplified functions are implemented by using a new desigen for neural networks. Neural networks are used because they are fault tolerance and as a result they can recognize signals even with noise or distortion. This is very useful for logic functions used in data and computer communications. Moreover, the implemented functions are realized with minimum amount of components. This is done by using modular neural nets (MNNs) that divide the input space into several homogenous regions. Such approach is applied to implement XOR function, 16 logic functions on one bit level, and 2-bit digital multiplier. Compared to previous non- modular designs, a clear reduction in the order of computations and hardware requirements is achieved.

Comparison of Anti-Shadoo Antibodies – Where is the Endogenous Shadoo protein?

Shadoo protein (Sho) was described in 2003 as the newest member of Prion protein superfamily [1]. Sho has similar structural motifs like prion protein (PrP) that is known for its central role in transmissible spongiform enchephalopathies. Although a great number of functions have been proposed, the exact physiological function of PrP is not known yet. Investigation of the function and localization of Sho may help us to understand the function of the Prion protein superfamily. Analyzing the subcellular localization of YFP-tagged forms of Sho, we detected the protein in the plasma membrane and in the nucleus of various cell lines. To reveal the localization of the endogenous protein we generated antibodies against Shadoo as well as employed commercially available anti-Shadoo antibodies: i) EG62 anti-mouse Shadoo antibody generated by Eurogentec Ltd.; ii) S-12 anti-human Shadoo antibody by Santa Cruz Biotechnology Inc.; iii) R-12 anti-mouse Shadoo antibody by Santa Cruz Biotechnology Inc.; iv) SPRN antibody against human Shadoo by Abgent Inc. We carried out immunocytochemistry on non-transfected HeLa, Zpl 2-1, Zw 3-5, GT1-1, GT1-7 and SHSY5Y cells as well as on YFP-Sho, Sho-YFP, and YFP-GPI transfected HeLa cells. Their specificity (in antibody-peptide competition assay) and co-localization (with the YFP signal) were assessed.

A Microstrip Antenna Design and Performance Analysis for RFID High Bit Rate Applications

Lately, an interest has grown greatly in the usages of RFID in an un-presidential applications. It is shown in the adaptation of major software companies such as Microsoft, IBM, and Oracle the RFID capabilities in their major software products. For example Microsoft SharePoints 2010 workflow is now fully compatible with RFID platform. In addition, Microsoft BizTalk server is also capable of all RFID sensors data acquisition. This will lead to applications that required high bit rate, long range and a multimedia content in nature. Higher frequencies of operation have been designated for RFID tags, among them are the 2.45 and 5.8 GHz. The higher the frequency means higher range, and higher bit rate, but the drawback is the greater cost. In this paper we present a single layer, low profile patch antenna operates at 5.8 GHz with pure resistive input impedance of 50 and close to directive radiation. Also, we propose a modification to the design in order to improve the operation band width from 8.7 to 13.8

Joint Design of MIMO Relay Networks Based on MMSE Criterion

This paper deals with wireless relay communication systems in which multiple sources transmit information to the destination node by the help of multiple relays. We consider a signal forwarding technique based on the minimum mean-square error (MMSE) approach with multiple antennas for each relay. A source-relay-destination joint design strategy is proposed with power constraints at the destination and the source nodes. Simulation results confirm that the proposed joint design method improves the average MSE performance compared with that of conventional MMSE relaying schemes.

A Fast Replica Placement Methodology for Large-scale Distributed Computing Systems

Fine-grained data replication over the Internet allows duplication of frequently accessed data objects, as opposed to entire sites, to certain locations so as to improve the performance of largescale content distribution systems. In a distributed system, agents representing their sites try to maximize their own benefit since they are driven by different goals such as to minimize their communication costs, latency, etc. In this paper, we will use game theoretical techniques and in particular auctions to identify a bidding mechanism that encapsulates the selfishness of the agents, while having a controlling hand over them. In essence, the proposed game theory based mechanism is the study of what happens when independent agents act selfishly and how to control them to maximize the overall performance. A bidding mechanism asks how one can design systems so that agents- selfish behavior results in the desired system-wide goals. Experimental results reveal that this mechanism provides excellent solution quality, while maintaining fast execution time. The comparisons are recorded against some well known techniques such as greedy, branch and bound, game theoretical auctions and genetic algorithms.

An Approach for Transient Response Calculation of large Nonproportionally Damped Structures using Component Mode Synthesis

A minimal complexity version of component mode synthesis is presented that requires simplified computer programming, but still provides adequate accuracy for modeling lower eigenproperties of large structures and their transient responses. The novelty is that a structural separation into components is done along a plane/surface that exhibits rigid-like behavior, thus only normal modes of each component is sufficient to use, without computing any constraint, attachment, or residual-attachment modes. The approach requires only such input information as a few (lower) natural frequencies and corresponding undamped normal modes of each component. A novel technique is shown for formulation of equations of motion, where a double transformation to generalized coordinates is employed and formulation of nonproportional damping matrix in generalized coordinates is shown.

A Renovated Cook's Distance Based On The Buckley-James Estimate In Censored Regression

There have been various methods created based on the regression ideas to resolve the problem of data set containing censored observations, i.e. the Buckley-James method, Miller-s method, Cox method, and Koul-Susarla-Van Ryzin estimators. Even though comparison studies show the Buckley-James method performs better than some other methods, it is still rarely used by researchers mainly because of the limited diagnostics analysis developed for the Buckley-James method thus far. Therefore, a diagnostic tool for the Buckley-James method is proposed in this paper. It is called the renovated Cook-s Distance, (RD* i ) and has been developed based on the Cook-s idea. The renovated Cook-s Distance (RD* i ) has advantages (depending on the analyst demand) over (i) the change in the fitted value for a single case, DFIT* i as it measures the influence of case i on all n fitted values Yˆ∗ (not just the fitted value for case i as DFIT* i) (ii) the change in the estimate of the coefficient when the ith case is deleted, DBETA* i since DBETA* i corresponds to the number of variables p so it is usually easier to look at a diagnostic measure such as RD* i since information from p variables can be considered simultaneously. Finally, an example using Stanford Heart Transplant data is provided to illustrate the proposed diagnostic tool.

Study on Specific Energy in Grinding of DRACs: A Response Surface Methodology Approach

In this study, the effects of machining parameters on specific energy during surface grinding of 6061Al-SiC35P composites are investigated. Vol% of SiC, feed and depth of cut were chosen as process variables. The power needed for the calculation of the specific energy is measured from the two watt meter method. Experiments are conducted using standard RSM design called Central composite design (CCD). A second order response surface model was developed for specific energy. The results identify the significant influence factors to minimize the specific energy. The confirmation results demonstrate the practicability and effectiveness of the proposed approach.

All Proteins Have a Basic Molecular Formula

This study proposes a basic molecular formula for all proteins. A total of 10,739 proteins belonging to 9 different protein groups classified on the basis of their functions were selected randomly. They included enzymes, storage proteins, hormones, signalling proteins, structural proteins, transport proteins, immunoglobulins or antibodies, motor proteins and receptor proteins. After obtaining the protein molecular formula using the ProtParam tool, the H/C, N/C, O/C, and S/C ratios were determined for each randomly selected sample. In this case, H, N, O, and S coefficients were specified per carbon atom. Surprisingly, the results demonstrated that H, N, O, and S coefficients for all 10,739 proteins are similar and highly correlated. This study demonstrates that despite differences in the structure and function, all known proteins have a similar basic molecular formula CnH1.58 ± 0.015nN0.28 ± 0.005nO0.30 ± 0.007nS0.01 ± 0.002n. The total correlation between all coefficients was found to be 0.9999.

Adaptation Learning Speed Control for a High- Performance Induction Motor using Neural Networks

This paper proposes an effective adaptation learning algorithm based on artificial neural networks for speed control of an induction motor assumed to operate in a high-performance drives environment. The structure scheme consists of a neural network controller and an algorithm for changing the NN weights in order that the motor speed can accurately track of the reference command. This paper also makes uses a very realistic and practical scheme to estimate and adaptively learn the noise content in the speed load torque characteristic of the motor. The availability of the proposed controller is verified by through a laboratory implementation and under computation simulations with Matlab-software. The process is also tested for the tracking property using different types of reference signals. The performance and robustness of the proposed control scheme have evaluated under a variety of operating conditions of the induction motor drives. The obtained results demonstrate the effectiveness of the proposed control scheme system performances, both in steady state error in speed and dynamic conditions, was found to be excellent and those is not overshoot.

Self-Adaptive Differential Evolution Based Power Economic Dispatch of Generators with Valve-Point Effects and Multiple Fuel Options

This paper presents the solution of power economic dispatch (PED) problem of generating units with valve point effects and multiple fuel options using Self-Adaptive Differential Evolution (SDE) algorithm. The global optimal solution by mathematical approaches becomes difficult for the realistic PED problem in power systems. The Differential Evolution (DE) algorithm is found to be a powerful evolutionary algorithm for global optimization in many real problems. In this paper the key parameters of control in DE algorithm such as the crossover constant CR and weight applied to random differential F are self-adapted. The PED problem formulation takes into consideration of nonsmooth fuel cost function due to valve point effects and multi fuel options of generator. The proposed approach has been examined and tested with the numerical results of PED problems with thirteen-generation units including valve-point effects, ten-generation units with multiple fuel options neglecting valve-point effects and ten-generation units including valve-point effects and multiple fuel options. The test results are promising and show the effectiveness of proposed approach for solving PED problems.

A Combined Conventional and Differential Evolution Method for Model Order Reduction

In this paper a mixed method by combining an evolutionary and a conventional technique is proposed for reduction of Single Input Single Output (SISO) continuous systems into Reduced Order Model (ROM). In the conventional technique, the mixed advantages of Mihailov stability criterion and continued Fraction Expansions (CFE) technique is employed where the reduced denominator polynomial is derived using Mihailov stability criterion and the numerator is obtained by matching the quotients of the Cauer second form of Continued fraction expansions. Then, retaining the numerator polynomial, the denominator polynomial is recalculated by an evolutionary technique. In the evolutionary method, the recently proposed Differential Evolution (DE) optimization technique is employed. DE method is based on the minimization of the Integral Squared Error (ISE) between the transient responses of original higher order model and the reduced order model pertaining to a unit step input. The proposed method is illustrated through a numerical example and compared with ROM where both numerator and denominator polynomials are obtained by conventional method to show its superiority.

Intelligent Agent Approach to the Control of Critical Infrastructure Networks

In this paper we propose an intelligent agent approach to control the electric power grid at a smaller granularity in order to give it self-healing capabilities. We develop a method using the influence model to transform transmission substations into information processing, analyzing and decision making (intelligent behavior) units. We also develop a wireless communication method to deliver real-time uncorrupted information to an intelligent controller in a power system environment. A combined networking and information theoretic approach is adopted in meeting both the delay and error probability requirements. We use a mobile agent approach in optimizing the achievable information rate vector and in the distribution of rates to users (sensors). We developed the concept and the quantitative tools require in the creation of cooperating semiautonomous subsystems which puts the electric grid on the path towards intelligent and self-healing system.

Analysis of the Shielding Effectiveness of Several Magnetic Shields

Today with the rapid growth of telecommunications equipment, electronic and developing more and more networks of power, influence of electromagnetic waves on one another has become hot topic discussions. So in this article, this issue and appropriate mechanisms for EMC operations have been presented. First, a source of alternating current (50 Hz) and a clear victim in a certain distance from the source is placed. With this simple model, the effects of electromagnetic radiation from the source to the victim will be investigated and several methods to reduce these effects have been presented. Therefore passive and active shields have been used. In some steps, shielding effectiveness of proposed shields will be compared. . It should be noted that simulations have been done by the finite element method (FEM).

Javanese Character Recognition Using Hidden Markov Model

Hidden Markov Model (HMM) is a stochastic method which has been used in various signal processing and character recognition. This study proposes to use HMM to recognize Javanese characters from a number of different handwritings, whereby HMM is used to optimize the number of state and feature extraction. An 85.7 % accuracy is obtained as the best result in 16-stated vertical model using pure HMM. This initial result is satisfactory for prompting further research.

Geochemistry of Tektites from Maoming of Guandong Province, China

We measured the major and trace element contents and Rb-Sr isotopic compositions of 12 tektites from the Maoming area, Guandong province (south China). All the samples studied are splash-form tektites which show pitted or grooved surfaces with schlieren structures on some surfaces. The trace element ratios Ba/Rb (avg. 4.33), Th/Sm (avg. 2.31), Sm/Sc (avg. 0.44), Th/Sc (avg. 1.01) , La/Sc (avg. 2.86), Th/U (avg. 7.47), Zr/Hf (avg. 46.01) and the rare earth elements (REE) contents of tektites of this study are similar to the average upper continental crust. From the chemical composition, it is suggested that tektites in this study are derived from similar parental terrestrial sedimentary deposit which may be related to post-Archean upper crustal rocks. The tektites from the Maoming area have high positive εSr(0) values-ranging from 176.9~190.5 which indicate that the parental material for these tektites have similar Sr isotopic compositions to old terrestrial sedimentary rocks and they were not dominantly derived from recent young sediments (such as soil or loess). The Sr isotopic data obtained by the present study support the conclusion proposed by Blum et al. (1992)[1] that the depositional age of sedimentary target materials is close to 170Ma (Jurassic). Mixing calculations based on the model proposed by Ho and Chen (1996)[2] for various amounts and combinations of target rocks indicate that the best fit for tektites from the Maoming area is a mixture of 40% shale, 30% greywacke, 30% quartzite.

Processing Web-Cam Images by a Neuro-Fuzzy Approach for Vehicular Traffic Monitoring

Traffic management in an urban area is highly facilitated by the knowledge of the traffic conditions in every street or highway involved in the vehicular mobility system. Aim of the paper is to propose a neuro-fuzzy approach able to compute the main parameters of a traffic system, i.e., car density, velocity and flow, by using the images collected by the web-cams located at the crossroads of the traffic network. The performances of this approach encourage its application when the traffic system is far from the saturation. A fuzzy model is also outlined to evaluate when it is suitable to use more accurate, even if more time consuming, algorithms for measuring traffic conditions near to saturation.

Genetic Programming Approach to Hierarchical Production Rule Discovery

Automated discovery of hierarchical structures in large data sets has been an active research area in the recent past. This paper focuses on the issue of mining generalized rules with crisp hierarchical structure using Genetic Programming (GP) approach to knowledge discovery. The post-processing scheme presented in this work uses flat rules as initial individuals of GP and discovers hierarchical structure. Suitable genetic operators are proposed for the suggested encoding. Based on the Subsumption Matrix(SM), an appropriate fitness function is suggested. Finally, Hierarchical Production Rules (HPRs) are generated from the discovered hierarchy. Experimental results are presented to demonstrate the performance of the proposed algorithm.