Abstract: Predicting the actual cost and duration in construction projects concern a continuous and existing problem for the construction sector. This paper addresses this problem with modern methods and data available from past public construction projects. 39 bridge projects, constructed in Greece, with a similar type of available data were examined. Considering each project’s attributes with the actual cost and the actual duration, correlation analysis is performed and the most appropriate predictive project variables are defined. Additionally, the most efficient subgroup of variables is selected with the use of the WEKA application, through its attribute selection function. The selected variables are used as input neurons for neural network models through correlation analysis. For constructing neural network models, the application FANN Tool is used. The optimum neural network model, for predicting the actual cost, produced a mean squared error with a value of 3.84886e-05 and it was based on the budgeted cost and the quantity of deck concrete. The optimum neural network model, for predicting the actual duration, produced a mean squared error with a value of 5.89463e-05 and it also was based on the budgeted cost and the amount of deck concrete.
Abstract: Application of five implementations of three data mining classification techniques was experimented for extracting important insights from tourism data. The aim was to find out the best performing algorithm among the compared ones for tourism knowledge discovery. Knowledge discovery process from data was used as a process model. 10-fold cross validation method is used for testing purpose. Various data preprocessing activities were performed to get the final dataset for model building. Classification models of the selected algorithms were built with different scenarios on the preprocessed dataset. The outperformed algorithm tourism dataset was Random Forest (76%) before applying information gain based attribute selection and J48 (C4.5) (75%) after selection of top relevant attributes to the class (target) attribute. In terms of time for model building, attribute selection improves the efficiency of all algorithms. Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented, which showed intricate, non-trivial knowledge/insight that would otherwise not be discovered by simple statistical analysis with mediocre accuracy of the machine using classification algorithms.
Abstract: Industrial Engineering is a broad multidisciplinary field with intersections and applications in numerous areas. When designing a product, it is important to determine the appropriate attributes of value and the preference function for which the product is optimized. This paper provides some guidelines on appropriate selection of attributes for preference and value functions for engineering design.
Abstract: The most important subtype of non-Hodgkin-s
lymphoma is the Diffuse Large B-Cell Lymphoma. Approximately
40% of the patients suffering from it respond well to therapy,
whereas the remainder needs a more aggressive treatment, in order to
better their chances of survival. Data Mining techniques have helped
to identify the class of the lymphoma in an efficient manner. Despite
that, thousands of genes should be processed to obtain the results.
This paper presents a comparison of the use of various attribute
selection methods aiming to reduce the number of genes to be
searched, looking for a more effective procedure as a whole.
Abstract: In this paper, a new learning approach for network
intrusion detection using naïve Bayesian classifier and ID3 algorithm
is presented, which identifies effective attributes from the training
dataset, calculates the conditional probabilities for the best attribute
values, and then correctly classifies all the examples of training and
testing dataset. Most of the current intrusion detection datasets are
dynamic, complex and contain large number of attributes. Some of
the attributes may be redundant or contribute little for detection
making. It has been successfully tested that significant attribute
selection is important to design a real world intrusion detection
systems (IDS). The purpose of this study is to identify effective
attributes from the training dataset to build a classifier for network
intrusion detection using data mining algorithms. The experimental
results on KDD99 benchmark intrusion detection dataset demonstrate
that this new approach achieves high classification rates and reduce
false positives using limited computational resources.