A Systems Approach to Gene Ranking from DNA Microarray Data of Cervical Cancer

In this paper we present a method for gene ranking from DNA microarray data. More precisely, we calculate the correlation networks, which are unweighted and undirected graphs, from microarray data of cervical cancer whereas each network represents a tissue of a certain tumor stage and each node in the network represents a gene. From these networks we extract one tree for each gene by a local decomposition of the correlation network. The interpretation of a tree is that it represents the n-nearest neighbor genes on the n-th level of a tree, measured by the Dijkstra distance, and, hence, gives the local embedding of a gene within the correlation network. For the obtained trees we measure the pairwise similarity between trees rooted by the same gene from normal to cancerous tissues. This evaluates the modification of the tree topology due to progression of the tumor. Finally, we rank the obtained similarity values from all tissue comparisons and select the top ranked genes. For these genes the local neighborhood in the correlation networks changes most between normal and cancerous tissues. As a result we find that the top ranked genes are candidates suspected to be involved in tumor growth and, hence, indicates that our method captures essential information from the underlying DNA microarray data of cervical cancer.

Versatile Dual-Mode Class-AB Four-Quadrant Analog Multiplier

Versatile dual-mode class-AB CMOS four-quadrant analog multiplier circuit is presented. The dual translinear loops and current mirrors are the basic building blocks in realization scheme. This technique provides; wide dynamic range, wide-bandwidth response and low power consumption. The major advantages of this approach are; its has single ended inputs; since its input is dual translinear loop operate in class-AB mode which make this multiplier configuration interesting for low-power applications; current multiplying, voltage multiplying, or current and voltage multiplying can be obtainable with balanced input. The simulation results of versatile analog multiplier demonstrate a linearity error of 1.2 %, a -3dB bandwidth of about 19MHz, a maximum power consumption of 0.46mW, and temperature compensated. Operation of versatile analog multiplier was also confirmed through an experiment using CMOS transistor array.

Structural Parsing of Natural Language Text in Tamil Using Phrase Structure Hybrid Language Model

Parsing is important in Linguistics and Natural Language Processing to understand the syntax and semantics of a natural language grammar. Parsing natural language text is challenging because of the problems like ambiguity and inefficiency. Also the interpretation of natural language text depends on context based techniques. A probabilistic component is essential to resolve ambiguity in both syntax and semantics thereby increasing accuracy and efficiency of the parser. Tamil language has some inherent features which are more challenging. In order to obtain the solutions, lexicalized and statistical approach is to be applied in the parsing with the aid of a language model. Statistical models mainly focus on semantics of the language which are suitable for large vocabulary tasks where as structural methods focus on syntax which models small vocabulary tasks. A statistical language model based on Trigram for Tamil language with medium vocabulary of 5000 words has been built. Though statistical parsing gives better performance through tri-gram probabilities and large vocabulary size, it has some disadvantages like focus on semantics rather than syntax, lack of support in free ordering of words and long term relationship. To overcome the disadvantages a structural component is to be incorporated in statistical language models which leads to the implementation of hybrid language models. This paper has attempted to build phrase structured hybrid language model which resolves above mentioned disadvantages. In the development of hybrid language model, new part of speech tag set for Tamil language has been developed with more than 500 tags which have the wider coverage. A phrase structured Treebank has been developed with 326 Tamil sentences which covers more than 5000 words. A hybrid language model has been trained with the phrase structured Treebank using immediate head parsing technique. Lexicalized and statistical parser which employs this hybrid language model and immediate head parsing technique gives better results than pure grammar and trigram based model.

2Taiwan Public Corporation's Participation in the Mechanism of Payment for Environmental Services

The Taiwan government has started to promote the “Plain Landscape Afforestation and Greening Program" since 2002. A key task of the program was the payment for environmental services (PES), entitled the “Plain Landscape Afforestation Policy" (PLAP), which was certificated by the Executive Yuan on August 31, 2001 and enacted on January 1, 2002. According to the policy, it is estimated that the total area of afforestation will be 25,100 hectares by December 31, 2007. Until the end of 2007, the policy had been enacted for six years in total and the actual area of afforestation was 8,919.18 hectares. Among them, Taiwan Sugar Corporation (TSC) was accounted for 7,960 hectares (with 2,450.83 hectares as public service area) which occupied 86.22% of the total afforestation area; the private farmland promoted by local governments was accounted for 869.18 hectares which occupied 9.75% of the total afforestation area. Based on the above, we observe that most of the afforestation area in this policy is executed by TSC, and the achievement ratio by TSC is better than by others. It implies that the success of the PLAP is seriously related to the execution of TSC. The objective of this study is to analyze the relevant policy planning of TSC-s participation in the PLAP, suggest complementary measures, and draw up effective adjustment mechanisms, so as to improve the effectiveness of executing the policy. Our main conclusions and suggestions are summarized as follows: 1. The main reason for TSC-s participation in the PLAP is based on their passive cooperation with the central government or company policy. Prior to TSC-s participation in the PLAP, their lands were mainly used for growing sugarcane. 2. The main factors of TSC-s consideration on the selection of tree species are based on the suitability of land and species. The largest proportion of tree species is allocated to economic forests, and the lack of technical instruction was the main problem during afforestation. Moreover, the method of improving TSC-s future development in leisure agriculture and landscape business becomes a key topic. 3. TSC has developed short and long-term plans on participating in the PLAP for the future. However, there is no great willingness or incentive on budgeting for such detailed planning. 4. Most people from TSC interviewed consider the requirements on PLAP unreasonable. Among them, an unreasonable requirement on the number of trees accounted for the greatest proportion; furthermore, most interviewees suggested that the government should continue to provide incentives even after 20 years. 5. Since the government shares the same goals as TSC, there should be sufficient cooperation and communication that support the technical instruction and reduction of afforestation cost, which will also help to improve effectiveness of the policy.

A Frame Work for the Development of a Suitable Method to Find Shoot Length at Maturity of Mustard Plant Using Soft Computing Model

The production of a plant can be measured in terms of seeds. The generation of seeds plays a critical role in our social and daily life. The fruit production which generates seeds, depends on the various parameters of the plant, such as shoot length, leaf number, root length, root number, etc When the plant is growing, some leaves may be lost and some new leaves may appear. It is very difficult to use the number of leaves of the tree to calculate the growth of the plant.. It is also cumbersome to measure the number of roots and length of growth of root in several time instances continuously after certain initial period of time, because roots grow deeper and deeper under ground in course of time. On the contrary, the shoot length of the tree grows in course of time which can be measured in different time instances. So the growth of the plant can be measured using the data of shoot length which are measured at different time instances after plantation. The environmental parameters like temperature, rain fall, humidity and pollution are also play some role in production of yield. The soil, crop and distance management are taken care to produce maximum amount of yields of plant. The data of the growth of shoot length of some mustard plant at the initial stage (7,14,21 & 28 days after plantation) is available from the statistical survey by a group of scientists under the supervision of Prof. Dilip De. In this paper, initial shoot length of Ken( one type of mustard plant) has been used as an initial data. The statistical models, the methods of fuzzy logic and neural network have been tested on this mustard plant and based on error analysis (calculation of average error) that model with minimum error has been selected and can be used for the assessment of shoot length at maturity. Finally, all these methods have been tested with other type of mustard plants and the particular soft computing model with the minimum error of all types has been selected for calculating the predicted data of growth of shoot length. The shoot length at the stage of maturity of all types of mustard plants has been calculated using the statistical method on the predicted data of shoot length.

Estimation Model of Dry Docking Duration Using Data Mining

Maintenance is one of the most important activities in the shipyard industry. However, sometimes it is not supported by adequate services from the shipyard, where inaccuracy in estimating the duration of the ship maintenance is still common. This makes estimation of ship maintenance duration is crucial. This study uses Data Mining approach, i.e., CART (Classification and Regression Tree) to estimate the duration of ship maintenance that is limited to dock works or which is known as dry docking. By using the volume of dock works as an input to estimate the maintenance duration, 4 classes of dry docking duration were obtained with different linear model and job criteria for each class. These linear models can then be used to estimate the duration of dry docking based on job criteria.

Numerical Simulation of Investment Casting of Gold Jewelry: Experiments and Validations

This paper proposes the numerical simulation of the investment casting of gold jewelry. It aims to study the behavior of fluid flow during mould filling and solidification and to optimize the process parameters, which lead to predict and control casting defects such as gas porosity and shrinkage porosity. A finite difference method, computer simulation software FLOW-3D was used to simulate the jewelry casting process. The simplified model was designed for both numerical simulation and real casting production. A set of sensor acquisitions were allocated on the different positions of the wax tree of the model to detect filling times, while a set of thermocouples were allocated to detect the temperature during casting and cooling. Those detected data were applied to validate the results of the numerical simulation to the results of the real casting. The resulting comparisons signify that the numerical simulation can be used as an effective tool in investment-casting-process optimization and casting-defect prediction.

Decision Trees for Predicting Risk of Mortality using Routinely Collected Data

It is well known that Logistic Regression is the gold standard method for predicting clinical outcome, especially predicting risk of mortality. In this paper, the Decision Tree method has been proposed to solve specific problems that commonly use Logistic Regression as a solution. The Biochemistry and Haematology Outcome Model (BHOM) dataset obtained from Portsmouth NHS Hospital from 1 January to 31 December 2001 was divided into four subsets. One subset of training data was used to generate a model, and the model obtained was then applied to three testing datasets. The performance of each model from both methods was then compared using calibration (the χ2 test or chi-test) and discrimination (area under ROC curve or c-index). The experiment presented that both methods have reasonable results in the case of the c-index. However, in some cases the calibration value (χ2) obtained quite a high result. After conducting experiments and investigating the advantages and disadvantages of each method, we can conclude that Decision Trees can be seen as a worthy alternative to Logistic Regression in the area of Data Mining.

Evaluation of the Impact of Dataset Characteristics for Classification Problems in Biological Applications

Availability of high dimensional biological datasets such as from gene expression, proteomic, and metabolic experiments can be leveraged for the diagnosis and prognosis of diseases. Many classification methods in this area have been studied to predict disease states and separate between predefined classes such as patients with a special disease versus healthy controls. However, most of the existing research only focuses on a specific dataset. There is a lack of generic comparison between classifiers, which might provide a guideline for biologists or bioinformaticians to select the proper algorithm for new datasets. In this study, we compare the performance of popular classifiers, which are Support Vector Machine (SVM), Logistic Regression, k-Nearest Neighbor (k-NN), Naive Bayes, Decision Tree, and Random Forest based on mock datasets. We mimic common biological scenarios simulating various proportions of real discriminating biomarkers and different effect sizes thereof. The result shows that SVM performs quite stable and reaches a higher AUC compared to other methods. This may be explained due to the ability of SVM to minimize the probability of error. Moreover, Decision Tree with its good applicability for diagnosis and prognosis shows good performance in our experimental setup. Logistic Regression and Random Forest, however, strongly depend on the ratio of discriminators and perform better when having a higher number of discriminators.

Project Selection by Using Fuzzy AHP and TOPSIS Technique

In this article, by using fuzzy AHP and TOPSIS technique we propose a new method for project selection problem. After reviewing four common methods of comparing alternatives investment (net present value, rate of return, benefit cost analysis and payback period) we use them as criteria in AHP tree. In this methodology by utilizing improved Analytical Hierarchy Process by Fuzzy set theory, first we try to calculate weight of each criterion. Then by implementing TOPSIS algorithm, assessment of projects has been done. Obtained results have been tested in a numerical example.

Comparative Study of Complexity in Streetscape Composition

This research is a comparative study of complexity, as a multidimensional concept, in the context of streetscape composition in Algeria and Japan. 80 streetscapes visual arrays have been collected and then presented to 20 participants, with different cultural backgrounds, in order to be categorized and classified according to their degrees of complexity. Three analysis methods have been used in this research: cluster analysis, ranking method and Hayashi Quantification method (Method III). The results showed that complexity, disorder, irregularity and disorganization are often conflicting concepts in the urban context. Algerian daytime streetscapes seem to be balanced, ordered and regular, and Japanese daytime streetscapes seem to be unbalanced, regular and vivid. Variety, richness and irregularity with some aspects of order and organization seem to characterize Algerian night streetscapes. Japanese night streetscapes seem to be more related to balance, regularity, order and organization with some aspects of confusion and ambiguity. Complexity characterized mainly Algerian avenues with green infrastructure. Therefore, for Japanese participants, Japanese traditional night streetscapes were complex. And for foreigners, Algerian and Japanese avenues nightscapes were the most complex visual arrays.

Accelerating GLA with an M-Tree

In this paper, we propose a novel improvement for the generalized Lloyd Algorithm (GLA). Our algorithm makes use of an M-tree index built on the codebook which makes it possible to reduce the number of distance computations when the nearest code words are searched. Our method does not impose the use of any specific distance function, but works with any metric distance, making it more general than many other fast GLA variants. Finally, we present the positive results of our performance experiments.

CFD Simulation of the Hydrodynamic Vibrator for Stuck - Pipe Liquidation

Stuck-pipe in drilling operations is one of the most pressing and expensive problems in the oil industry. This paper describes a computational simulation and an experimental study of the hydrodynamic vibrator, which may be used for liquidation of stuck-pipe problems during well drilling. The work principle of the vibrator is based upon the known phenomena of Vortex Street of Karman and the resulting generation of vibrations. We will discuss the computational simulation and experimental investigations of vibrations in this device. The frequency of the vibration parameters has been measured as a function of the wide range Reynolds Number. The validity of the computational simulation and of the assumptions on which it is based has been proved experimentally. The computational simulation of the vibrator work and its effectiveness was carried out using FLUENT software. The research showed high degree of congruence with the results of the laboratory tests and allowed to determine the effect of the granular material features upon the pipe vibration in the well. This study demonstrates the potential of using the hydrodynamic vibrator in a well drilling system.