Abstract: In this paper we present a method for gene ranking
from DNA microarray data. More precisely, we calculate the correlation
networks, which are unweighted and undirected graphs, from
microarray data of cervical cancer, where each network represents
a tissue of a certain tumor stage and each node in the network
represents a gene. From these networks we extract one tree for
each gene by a local decomposition of the correlation network. The
interpretation of a tree is that its n-th level contains the n-nearest
neighbor genes, measured by the Dijkstra distance,
and, hence, gives the local embedding of a gene within the correlation
network. For the obtained trees we measure the pairwise similarity
between trees rooted at the same gene from normal to cancerous
tissues. This evaluates the modification of the tree topology due to
progression of the tumor. Finally, we rank the obtained similarity
values from all tissue comparisons and select the top ranked genes.
For these genes the local neighborhood in the correlation networks
changes most between normal and cancerous tissues. As a result
we find that the top-ranked genes are candidates suspected to be
involved in tumor growth, which indicates that our method
captures essential information from the underlying DNA microarray
data of cervical cancer.
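The local decomposition described above can be sketched in a few lines; in an unweighted graph the Dijkstra distance reduces to breadth-first-search levels, so level n of the tree holds the n-nearest neighbors. The gene names, correlation values, and threshold below are hypothetical illustrations, not data from the paper.

```python
# Hypothetical toy correlation values among five genes (illustrative only).
genes = ["g1", "g2", "g3", "g4", "g5"]
corr = {
    ("g1", "g2"): 0.9, ("g2", "g3"): 0.8,
    ("g3", "g4"): 0.85, ("g1", "g5"): 0.2,
}

def build_network(corr, threshold=0.7):
    """Unweighted, undirected graph: link genes whose |correlation| exceeds the threshold."""
    adj = {}
    for (a, b), c in corr.items():
        if abs(c) > threshold:
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    return adj

def neighbor_tree(adj, root):
    """Level n of the returned list holds genes at BFS (Dijkstra) distance n from the root."""
    levels, seen, frontier = [], {root}, [root]
    while frontier:
        levels.append(sorted(frontier))
        nxt = set()
        for g in frontier:
            nxt |= adj.get(g, set()) - seen
        seen |= nxt
        frontier = sorted(nxt)
    return levels

net = build_network(corr)
print(neighbor_tree(net, "g1"))  # [['g1'], ['g2'], ['g3'], ['g4']]
```

Comparing such level lists for the same root gene across tissue networks then gives the pairwise tree similarity used for the ranking.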
Abstract: A versatile dual-mode class-AB CMOS four-quadrant
analog multiplier circuit is presented. Dual translinear loops and
current mirrors are the basic building blocks of the realization scheme.
This technique provides a wide dynamic range, wide-bandwidth response,
and low power consumption. The major advantages of this
approach are its single-ended inputs and its dual translinear input
loops operating in class-AB mode, which makes this multiplier
configuration interesting for low-power applications; current multiplying,
voltage multiplying, or combined current and voltage multiplying can
be obtained with balanced inputs. The simulation results of the versatile
analog multiplier demonstrate a linearity error of 1.2%, a -3 dB bandwidth
of about 19 MHz, a maximum power consumption of 0.46 mW,
and temperature-compensated operation. Operation of the versatile
analog multiplier was also confirmed experimentally using a CMOS
transistor array.
Abstract: Parsing is important in Linguistics and Natural
Language Processing to understand the syntax and semantics of a
natural language grammar. Parsing natural language text is
challenging because of problems such as ambiguity and inefficiency.
The interpretation of natural language text also depends on
context-based techniques. A probabilistic component is essential to resolve
ambiguity in both syntax and semantics, thereby increasing the accuracy
and efficiency of the parser. The Tamil language has inherent
features that make parsing more challenging. To address these challenges, a
lexicalized and statistical approach must be applied to parsing
with the aid of a language model. Statistical models mainly focus on
the semantics of the language and are suitable for large-vocabulary
tasks, whereas structural methods focus on syntax and model
small-vocabulary tasks. A trigram-based statistical language model
for Tamil with a medium vocabulary of 5,000 words has
been built. Though statistical parsing gives better performance
through trigram probabilities and a large vocabulary size, it has some
disadvantages, such as its focus on semantics rather than syntax and its
lack of support for free word order and long-term relationships. To
overcome these disadvantages, a structural component must be
incorporated into statistical language models, which leads to the
implementation of hybrid language models. This paper attempts
to build a phrase-structured hybrid language model that resolves the
above-mentioned disadvantages. In the development of the hybrid
language model, a new part-of-speech tag set for Tamil with
more than 500 tags, offering wider coverage, has
been developed. A phrase-structured treebank of 326
Tamil sentences covering more than 5,000 words has been built. A hybrid
language model has been trained on the phrase-structured treebank
using the immediate-head parsing technique. A lexicalized statistical
parser employing this hybrid language model and immediate-head
parsing gives better results than pure grammar-based and
trigram-based models.
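As a sketch of the statistical component, a maximum-likelihood trigram model can be built from counts alone. The tiny English corpus below is a hypothetical stand-in for the Tamil treebank; the paper's actual data, tag set, and any smoothing are not shown.

```python
from collections import Counter

# Hypothetical two-sentence corpus standing in for the treebank sentences.
corpus = [
    ["the", "boy", "reads", "a", "book"],
    ["the", "girl", "reads", "a", "letter"],
]

tri, bi = Counter(), Counter()
for sent in corpus:
    toks = ["<s>", "<s>"] + sent + ["</s>"]  # pad so every word has two predecessors
    for i in range(len(toks) - 2):
        tri[tuple(toks[i:i + 3])] += 1
        bi[tuple(toks[i:i + 2])] += 1

def p_trigram(w3, w1, w2):
    """Maximum-likelihood estimate of P(w3 | w1, w2) from the counts."""
    return tri[(w1, w2, w3)] / bi[(w1, w2)] if bi[(w1, w2)] else 0.0

print(p_trigram("reads", "the", "boy"))  # 1.0: "reads" always follows "the boy"
print(p_trigram("book", "reads", "a"))   # 0.5: "book" follows "reads a" in one of two cases
```

A hybrid model would combine such trigram probabilities with phrase-structure rules from the treebank, which this sketch does not attempt.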
Abstract: The Taiwan government has promoted the “Plain Landscape Afforestation and Greening Program” since 2002. A key task of the program was the payment for environmental services (PES), entitled the “Plain Landscape Afforestation Policy” (PLAP), which was approved by the Executive Yuan on August 31, 2001 and enacted on January 1, 2002. According to the policy, the total area of afforestation was estimated to reach 25,100 hectares by December 31, 2007. By the end of 2007, the policy had been in force for six years and the actual afforested area was 8,919.18 hectares. Of this, Taiwan Sugar Corporation (TSC) accounted for 7,960 hectares (including 2,450.83 hectares of public service area), or 86.22% of the total afforestation area; private farmland promoted by local governments accounted for 869.18 hectares, or 9.75% of the total. Based on the above, we observe that most of the afforestation area under this policy was executed by TSC, and that the achievement ratio of TSC was better than that of others. This implies that the success of the PLAP depends heavily on TSC's execution. The objective of this study is to analyze the relevant policy planning of TSC's participation in the PLAP, suggest complementary measures, and draw up effective adjustment mechanisms, so as to improve the effectiveness of executing the policy. Our main conclusions and suggestions are summarized as follows: 1. The main reason for TSC's participation in the PLAP is its passive cooperation with the central government or company policy. Prior to TSC's participation in the PLAP, its lands were mainly used for growing sugarcane. 2. The main factors in TSC's selection of tree species are the suitability of the land and the species.
The largest proportion of tree species is allocated to economic forests, and the lack of technical instruction was the main problem during afforestation. Moreover, how to improve TSC's future development in leisure agriculture and the landscape business becomes a key topic. 3. TSC has developed short- and long-term plans for future participation in the PLAP. However, there is little willingness or incentive to budget for such detailed planning. 4. Most of the TSC interviewees consider the PLAP requirements unreasonable. Among these, the requirement on the number of trees was considered unreasonable by the greatest proportion; furthermore, most interviewees suggested that the government should continue to provide incentives even after 20 years. 5. Since the government shares the same goals as TSC, there should be sufficient cooperation and communication to support technical instruction and the reduction of afforestation costs, which will also help to improve the effectiveness of the policy.
Abstract: The production of a plant can be measured in terms of
seeds. The generation of seeds plays a critical role in our social and
daily life. Fruit production, which generates seeds, depends on
various parameters of the plant, such as shoot length, leaf number,
root length, and root number. As the plant grows, some leaves
may be lost and new leaves may appear, so it is very difficult to
use the number of leaves to measure the growth of the
plant. It is also cumbersome to measure the number of roots and the
length of root growth repeatedly after a certain initial period,
because the roots grow deeper and deeper underground over time.
By contrast, the shoot length of the plant increases over time
and can be measured at different time instances. Therefore, the
growth of the plant can be measured using shoot-length data
recorded at different time instances after plantation.
Environmental parameters such as temperature, rainfall,
humidity, and pollution also play a role in yield.
Soil, crop, and spacing management are undertaken to
maximize the yield of the plant. Data on the growth
of the shoot length of some mustard plants at the initial stage (7, 14, 21, and
28 days after plantation) are available from a statistical survey by a
group of scientists under the supervision of Prof. Dilip De. In this
paper, the initial shoot length of Ken (a type of mustard plant) has
been used as initial data. Statistical models and methods of
fuzzy logic and neural networks have been tested on this mustard
plant, and based on error analysis (calculation of the average error) the
model with the minimum error has been selected for the
assessment of shoot length at maturity. Finally, all these methods
have been tested on other types of mustard plants, and the
soft computing model with the minimum error across all types has been
selected for predicting the growth of the shoot length.
The shoot length at maturity of all types of mustard
plants has then been calculated by applying the statistical method to the
predicted shoot-length data.
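The model-selection step (fit several candidate models, then keep the one with the minimum average error) can be sketched as follows. The shoot-length readings and the two candidate models below are illustrative assumptions, not the survey's actual data or the paper's fuzzy/neural models.

```python
import math

# Hypothetical shoot-length readings (cm) at 7, 14, 21, 28 days after plantation.
days = [7, 14, 21, 28]
length = [4.1, 7.9, 10.2, 11.6]

def fit_linear(xs, ys):
    """Ordinary least-squares line y = a + b*x, returned as a callable."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return lambda x: a + b * x

models = {
    "linear": fit_linear(days, length),
    # Logarithmic growth: a line fitted against ln(day).
    "log": (lambda f: (lambda x: f(math.log(x))))(
        fit_linear([math.log(d) for d in days], length)),
}

# Select the model with minimum average absolute error, as described above.
avg_err = {name: sum(abs(m(d) - y) for d, y in zip(days, length)) / len(days)
           for name, m in models.items()}
best = min(avg_err, key=avg_err.get)
print(best, round(avg_err[best], 3))
```

With this particular toy data the logarithmic model wins, reflecting growth that slows with time; the selected model would then be used to extrapolate shoot length at maturity.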
Abstract: Maintenance is one of the most important activities in
the shipyard industry. However, it is sometimes not supported by
adequate services from the shipyard, where inaccuracy in estimating
the duration of ship maintenance is still common. This makes accurate
estimation of ship maintenance duration crucial. This study uses a
data mining approach, CART (Classification and Regression
Trees), to estimate the duration of ship maintenance limited to
dock works, also known as dry docking. Using the volume
of dock works as an input to estimate the maintenance duration, four
classes of dry docking duration were obtained, each with its own linear
model and job criteria. These linear models can then be
used to estimate the duration of dry docking based on job criteria.
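The core CART step, choosing the split on dock-work volume that minimizes the squared error of the resulting groups, can be sketched as follows; the job data are hypothetical, and a full CART would recurse and fit a linear model per leaf rather than stop at one split.

```python
# Hypothetical (volume of dock works, duration in days) pairs.
jobs = [(10, 5), (12, 6), (15, 7), (40, 14), (45, 15), (50, 17)]

def sse(ys):
    """Sum of squared errors around the mean (CART's regression impurity)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(data):
    """Pick the volume threshold minimizing the combined SSE of the two leaves."""
    xs = sorted({x for x, _ in data})
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]  # midpoints between values
    def cost(t):
        left = [y for x, y in data if x <= t]
        right = [y for x, y in data if x > t]
        return sse(left) + sse(right)
    return min(candidates, key=cost)

t = best_split(jobs)
print(t)  # 27.5: separates the small dock jobs from the large ones
```

Repeating this splitting recursively on each side yields the duration classes; fitting a line to the jobs in each leaf gives the per-class linear estimators described above.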
Abstract: This paper proposes the numerical simulation of the
investment casting of gold jewelry. It aims to study the behavior of
fluid flow during mould filling and solidification and to optimize the
process parameters, enabling the prediction and control of casting defects
such as gas porosity and shrinkage porosity. The finite-difference
simulation software FLOW-3D was used to
simulate the jewelry casting process. A simplified model was
designed for both numerical simulation and real casting production.
Sensors were placed at different positions
on the wax tree of the model to detect filling times, while
thermocouples were placed to record the temperature during
casting and cooling. The measured data were used to validate the
results of the numerical simulation against those of the real casting.
The comparisons show that the numerical simulation can
be used as an effective tool in investment-casting-process
optimization and casting-defect prediction.
Abstract: It is well known that Logistic Regression is the gold
standard method for predicting clinical outcome, especially
predicting risk of mortality. In this paper, the Decision Tree method
has been proposed to solve specific problems that commonly use
Logistic Regression as a solution. The Biochemistry and
Haematology Outcome Model (BHOM) dataset obtained from
Portsmouth NHS Hospital from 1 January to 31 December 2001 was
divided into four subsets. One subset of training data was used to
generate a model, and the model obtained was then applied to three
testing datasets. The performance of each model from both methods
was then compared using calibration (the χ2 goodness-of-fit test) and
discrimination (the area under the ROC curve, or c-index). The experiments
showed that both methods give reasonable results in terms of the
c-index. However, in some cases the calibration value (χ2) was
quite high. After conducting experiments and investigating
the advantages and disadvantages of each method, we can conclude
that Decision Trees can be seen as a worthy alternative to Logistic
Regression in the area of Data Mining.
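The two evaluation measures named above can be sketched directly. The predicted risks and outcomes below are hypothetical, and the equal-size risk grouping in the χ2 statistic is a simplified assumption (the study's exact grouping is not stated).

```python
# Hypothetical predicted mortality risks and observed outcomes (1 = event).
preds =    [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
outcomes = [0,   0,   0,   1,   0,   1,   1,   1  ]

def c_index(p, y):
    """Discrimination: probability a random positive is ranked above a random negative."""
    pos = [pi for pi, yi in zip(p, y) if yi == 1]
    neg = [pi for pi, yi in zip(p, y) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def chi_square(p, y, groups=2):
    """Calibration: compare observed vs expected event counts in equal-size risk groups."""
    size = len(p) // groups
    pairs = sorted(zip(p, y))
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * size:(g + 1) * size]
        exp = sum(pi for pi, _ in chunk)      # expected events = sum of risks
        obs = sum(yi for _, yi in chunk)      # observed events
        stat += (obs - exp) ** 2 / (exp * (1 - exp / len(chunk)))
    return stat

print(round(c_index(preds, outcomes), 3))  # 0.938
```

A model can score well on one measure and poorly on the other, which is exactly the pattern the abstract reports for the χ2 values.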
Abstract: The availability of high-dimensional biological datasets, such as those from gene expression, proteomic, and metabolic experiments, can be leveraged for the diagnosis and prognosis of diseases. Many classification methods in this area have been studied to predict disease states and separate predefined classes, such as patients with a particular disease versus healthy controls. However, most of the existing research focuses only on a specific dataset. There is a lack of generic comparison between classifiers, which might provide a guideline for biologists or bioinformaticians to select the proper algorithm for new datasets. In this study, we compare the performance of popular classifiers, namely Support Vector Machine (SVM), Logistic Regression, k-Nearest Neighbor (k-NN), Naive Bayes, Decision Tree, and Random Forest, on mock datasets. We mimic common biological scenarios by simulating various proportions of real discriminating biomarkers and different effect sizes thereof. The results show that SVM performs quite stably and reaches a higher AUC than the other methods. This may be explained by the ability of SVM to minimize the probability of error. Moreover, Decision Tree, with its good applicability for diagnosis and prognosis, shows good performance in our experimental setup. Logistic Regression and Random Forest, however, strongly depend on the ratio of discriminators and perform better with a higher number of discriminators.
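The mock-data setup described above might be sketched as follows. All settings (sample counts, discriminator fraction, effect size, Gaussian noise) are illustrative assumptions rather than the study's actual simulation parameters.

```python
import random

def make_mock(n_samples=100, n_features=20, frac_discriminating=0.25,
              effect=1.5, seed=0):
    """Two-class mock dataset: the first frac_discriminating features are true
    biomarkers, shifted by `effect` in the case class; the rest are pure noise."""
    rng = random.Random(seed)
    n_disc = int(n_features * frac_discriminating)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate case / control
        row = [rng.gauss(effect * label if j < n_disc else 0.0, 1.0)
               for j in range(n_features)]
        X.append(row)
        y.append(label)
    return X, y

X, y = make_mock()
# A discriminating feature should show a visible mean shift between the classes.
mean1 = sum(X[i][0] for i in range(len(X)) if y[i] == 1) / 50
mean0 = sum(X[i][0] for i in range(len(X)) if y[i] == 0) / 50
print(round(mean1 - mean0, 2))  # roughly the effect size, up to sampling noise
```

Varying `frac_discriminating` and `effect` across runs reproduces the kind of scenario grid on which the classifiers are compared.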
Abstract: In this article, using the fuzzy AHP and TOPSIS
techniques, we propose a new method for the project selection problem.
After reviewing four common methods of comparing
investment alternatives (net present value, rate of return, benefit-cost
analysis, and payback period), we use them as criteria in the AHP tree.
In this methodology, the Analytic Hierarchy Process is improved
with fuzzy set theory, and we first calculate the weight of each
criterion. Then, by implementing the TOPSIS algorithm, the
projects are assessed. The obtained results are illustrated with a
numerical example.
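The TOPSIS step can be sketched as follows. The decision matrix, weights, and benefit/cost labels below are hypothetical, with the weights standing in for the output of the fuzzy AHP stage.

```python
import math

# Hypothetical projects scored on the four criteria named above:
# NPV, rate of return, benefit-cost ratio, payback period (years).
matrix = [
    [50, 0.12, 1.4, 5],   # project A
    [70, 0.10, 1.2, 4],   # project B
    [60, 0.15, 1.5, 6],   # project C
]
weights = [0.4, 0.3, 0.2, 0.1]          # assumed fuzzy-AHP criterion weights
benefit = [True, True, True, False]     # payback period: smaller is better

def topsis(matrix, weights, benefit):
    n_crit = len(weights)
    # Vector-normalize each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[w * row[j] / norms[j] for j, w in enumerate(weights)] for row in matrix]
    ideal = [max(col) if benefit[j] else min(col) for j, col in enumerate(zip(*v))]
    anti = [min(col) if benefit[j] else max(col) for j, col in enumerate(zip(*v))]
    def dist(row, ref):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(row, ref)))
    # Relative closeness to the ideal solution: higher is better.
    return [dist(r, anti) / (dist(r, anti) + dist(r, ideal)) for r in v]

scores = topsis(matrix, weights, benefit)
best = max(range(len(scores)), key=scores.__getitem__)
print(best, [round(s, 3) for s in scores])  # project C (index 2) ranks highest here
```

The ranking is driven jointly by the weights and the distance to the ideal/anti-ideal points, which is why the fuzzy AHP weighting stage matters.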
Abstract: This research is a comparative study of complexity, as a multidimensional concept, in the context of streetscape composition in Algeria and Japan. 80 streetscape visual arrays were collected and then presented to 20 participants with different cultural backgrounds, in order to be categorized and classified according to their degree of complexity. Three analysis methods were used in this research: cluster analysis, the ranking method, and the Hayashi Quantification method (Method III). The results showed that complexity, disorder, irregularity, and disorganization are often conflicting concepts in the urban context. Algerian daytime streetscapes seem to be balanced, ordered, and regular, while Japanese daytime streetscapes seem to be unbalanced, regular, and vivid. Variety, richness, and irregularity, with some aspects of order and organization, seem to characterize Algerian night streetscapes. Japanese night streetscapes seem to be more related to balance, regularity, order, and organization, with some aspects of confusion and ambiguity. Complexity mainly characterized Algerian avenues with green infrastructure. Accordingly, for Japanese participants, Japanese traditional night streetscapes were complex, and for foreigners, Algerian and Japanese avenue nightscapes were the most complex visual arrays.
Abstract: In this paper, we propose a novel improvement to the Generalized Lloyd Algorithm (GLA). Our algorithm makes use of an M-tree index built on the codebook, which makes it possible to reduce the number of distance computations when the nearest codewords are searched. Our method does not impose the use of any specific distance function, but works with any metric distance, making it more general than many other fast GLA variants. Finally, we present the positive results of our performance experiments.
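The paper's M-tree is a full metric index; as a simplified sketch of the same triangle-inequality pruning idea, even a single pivot lets us skip codewords that cannot beat the current best distance. The 2-D codebook and queries below are hypothetical, and this single-pivot scheme is a stand-in for the M-tree, not the paper's actual structure.

```python
import math

# Hypothetical 2-D codebook; any metric distance would work.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
pivot = codebook[0]

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Distances to the pivot are precomputed once per codebook.
d_to_pivot = [dist(c, pivot) for c in codebook]

def nearest(query):
    """Nearest codeword; |d(q,pivot) - d(c,pivot)| >= best rules out candidates
    by the triangle inequality without computing d(q,c)."""
    dq = dist(query, pivot)
    best_i, best_d, computed = 0, dist(query, codebook[0]), 1
    for i in range(1, len(codebook)):
        if abs(dq - d_to_pivot[i]) >= best_d:
            continue  # pruned: this codeword cannot be closer than best_d
        d = dist(query, codebook[i])
        computed += 1
        if d < best_d:
            best_i, best_d = i, d
    return best_i, computed

print(nearest((0.2, 0.1)))  # (0, 1): only one distance computed, rest pruned
```

An M-tree generalizes this by organizing many pivots hierarchically, so the pruning applies to whole subtrees of codewords at once.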
Abstract: Stuck-pipe in drilling operations is one of the most
pressing and expensive problems in the oil industry. This paper
describes a computational simulation and an experimental study of
a hydrodynamic vibrator, which may be used to eliminate
stuck-pipe problems during well drilling. The working principle of the
vibrator is based on the well-known Kármán vortex street
phenomenon and the resulting generation of vibrations. We discuss
the computational simulation and experimental investigation of
vibrations in this device. The vibration frequency has been measured
as a function of the Reynolds number over a wide range.
The validity of the computational simulation and of the assumptions
on which it is based has been proved experimentally. The
computational simulation of the vibrator's operation and effectiveness
was carried out using FLUENT software. The simulation showed a high
degree of agreement with the laboratory test results and
allowed us to determine the effect of granular material properties on
pipe vibration in the well. This study demonstrates the potential
of using the hydrodynamic vibrator in a well drilling system.
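As a back-of-the-envelope illustration of the mechanism, Kármán vortex shedding behind a bluff body of characteristic size D in a flow of velocity U occurs at roughly f = St·U/D, with the Strouhal number St ≈ 0.2 over a wide Reynolds-number range. The flow values below are assumptions for illustration, not the paper's experimental parameters.

```python
def shedding_frequency(velocity_m_s, diameter_m, strouhal=0.2):
    """Kármán vortex shedding frequency f = St * U / D (Hz)."""
    return strouhal * velocity_m_s / diameter_m

def reynolds(velocity_m_s, diameter_m, nu_m2_s=1.0e-6):
    """Reynolds number Re = U * D / nu; nu defaults to water at ~20 °C."""
    return velocity_m_s * diameter_m / nu_m2_s

# Assumed: 2 m/s drilling-fluid flow past a 5 cm bluff element of the vibrator.
U, D = 2.0, 0.05
print(round(shedding_frequency(U, D), 1), round(reynolds(U, D)))  # 8.0 100000
```

Since f scales linearly with U at fixed geometry, sweeping the flow rate sweeps the excitation frequency, which is why the vibration was characterized as a function of the Reynolds number.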