Abstract: Rapid technological innovation has led to a tremendous amount of
data accumulating all over the world in every domain, such as pattern
recognition, machine learning, spatial data mining, image analysis, fraud
analysis, and the World Wide Web. This makes it essential to develop tools
for data mining functionalities. The major aim of this paper is to analyze
various tools used to build resourceful analytical or descriptive models for
handling large amounts of information efficiently and in a user-friendly way.
In this survey, the diverse tools are illustrated with their technical
paradigms, graphical interfaces, and built-in algorithms, which are very
useful for handling significant amounts of data.
Abstract: Nowadays, the Web has become one of the most pervasive platforms
for information exchange and retrieval, gathering the suitable, well-fitting
information from the websites one requires. Data mining is the process of
extracting knowledge from the data available on the Internet. Web mining is
a branch of data mining that relates to various research communities, such
as information retrieval, database management systems, and artificial
intelligence. In this paper we discuss the concepts of Web mining, focusing
on one of its categories, Web content mining, and its various tasks. Mining
tools are essential for scanning the many images, text, and HTML documents,
and their results are then used by the various search engines. We conclude
by presenting a comparative table of these tools based on some pertinent
criteria.
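The most basic Web content mining step mentioned above, scanning text out of HTML documents, can be sketched with Python's standard-library parser. This is a hypothetical illustration only, not one of the surveyed tools, and the sample document is invented:

```python
# Minimal Web content mining sketch: pull visible text out of an HTML
# document using Python's standard-library HTMLParser.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.chunks = []           # collected text fragments

    def handle_data(self, data):
        # Keep only non-whitespace text nodes.
        if data.strip():
            self.chunks.append(data.strip())

doc = "<html><body><h1>Web Mining</h1><p>Content mining extracts text.</p></body></html>"
p = TextExtractor()
p.feed(doc)
print(p.chunks)  # the text a search engine would index
```

A real tool would add link extraction, deduplication, and tokenization on top of this step.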
Abstract: There have been many efforts and research projects undertaken to develop efficient tools for performing the several tasks of data mining. Due to the massive amount of information embedded in the huge data warehouses maintained in several domains, the manual extraction of meaningful patterns is no longer feasible. This makes the development of data mining tools all the more obligatory. The major aim of data mining software is to build a resourceful predictive or descriptive model for handling large amounts of information efficiently and in a user-friendly way. Data mining mainly deals with excessively large collections of data that impose rigorous computational constraints, and these challenges have led to the emergence of powerful data mining technologies. In this survey, a diverse collection of data mining tools is described, and the tools are contrasted in terms of their salient features and performance behavior.
Abstract: This paper discusses the use of explorative data
mining tools that allow the educator to explore new relationships
between reported learning experiences and actual activities,
even if there are multiple dimensions with a large number
of measured items. The underlying technology is based on
the so-called Compendium Platform for Reproducible Computing
(http://www.freestatistics.org), which was built on top of the R
computational framework (http://www.wessa.net).
Abstract: Data mining (DM) is the process of finding and extracting frequent patterns that can describe the data or predict unknown or future values. These goals are achieved by using various learning algorithms. Each algorithm may produce a mining result completely different from the others, and some algorithms may find millions of patterns. It is thus a difficult job for data analysts to select appropriate models and interpret the discovered knowledge. In this paper, we describe the framework of an intelligent and complete data mining system called SUT-Miner. Our system comprises a full complement of major DM algorithms together with pre-DM and post-DM functionalities. It is the post-DM packages that ease DM deployment for business intelligence applications.
Abstract: Information is power. Geographic information science is an
emerging field that advances our understanding of the relationship of
"place" with other disciplines such as crime. The researchers used crime
data for the years 2004 to 2007 from the Baguio City Police Office to
determine the incidence and actual locations of crime hotspots.
Combined qualitative and quantitative research methodology was
employed through extensive fieldwork and observation, geographic
visualization with Geographic Information Systems (GIS) and Global
Positioning Systems (GPS), and data mining. The paper discusses
emerging geographic visualization and data mining tools and
methodologies that can be used to generate baseline data for
environmental initiatives such as urban renewal and rejuvenation.
The study demonstrated that crime hotspots can be computed and were
observed to occur in select places in the Central Business District (CBD)
of Baguio City. Some characteristics of the hotspot locations, such as
physical design and milieu, may play an important role in creating
opportunities for crime. A list of these environmental attributes was
generated. This derived information may be used to guide the design or
redesign of the City's urban environment in order to reduce crime and, at
the same time, improve it physically.
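Hotspot computation of the kind described above is normally done inside GIS software; purely as a hypothetical illustration of the idea, incident points can be binned into grid cells and the dense cells reported. The coordinates, cell size, and threshold below are invented, not the study's data:

```python
# Hypothetical hotspot sketch (not the study's GIS method): bin (lat, lon)
# incident points into fixed-size grid cells and keep cells whose incident
# count reaches a threshold.
from collections import Counter

def hotspots(points, cell=0.001, min_count=3):
    """Return {grid_cell: count} for cells with at least min_count incidents."""
    counts = Counter((round(lat / cell), round(lon / cell)) for lat, lon in points)
    return {cell_id: n for cell_id, n in counts.items() if n >= min_count}

# Invented incident coordinates: three close together, one far away.
incidents = [(16.4023, 120.5964), (16.4021, 120.5958),
             (16.4018, 120.5962), (16.4300, 120.6100)]
print(hotspots(incidents, cell=0.001))
```

Real workflows would use projected coordinates and a density method (e.g. kernel density estimation) rather than a raw grid count.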
Abstract: Estimating the time and cost of work completion in a project, and
following them up during execution, contribute to the success or failure of
a project and are very important for the project management team.
Delivering on time and within the budgeted cost requires well-managed and
well-controlled projects. To deal with the complex task of controlling and
modifying the baseline project schedule during execution, earned value
management systems have been set up and are widely used to measure and
communicate the real physical progress of a project. However, they often
fail to predict the total duration of the project. In this paper, data
mining techniques are used to predict the total project duration in terms
of the time Estimate At Completion, EAC(t). For this purpose, we used a
project with 90 activities that was updated day by day. We then used the
regular indexes from the literature and applied the Earned Duration Method
to calculate the time estimate at completion, set these as input data for
prediction, and identified the major parameters among them using Clem
software. Using data mining, the effective parameters on EAC(t) and the
relationships between them could be extracted, which is very useful for
managing a project with minimum delay risk. As we state, this could be a
simple, safe, and applicable method for predicting the completion time of a
project during execution.
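As a hedged illustration of the quantities involved (not the authors' code or data), the Earned Duration Method's time estimate at completion is commonly written EAC(t) = AT + (PD − ED) / PF, with earned duration ED = AT × SPI and SPI = EV / PV. A minimal sketch, assuming a performance factor PF = 1 and invented progress figures:

```python
# Earned Duration Method sketch: time Estimate At Completion, EAC(t).
# AT = actual time elapsed, PD = planned duration, EV/PV = earned/planned value.
def eac_t_earned_duration(pv, ev, actual_time, planned_duration, pf=1.0):
    """EAC(t) = AT + (PD - ED) / PF, where ED = AT * SPI and SPI = EV / PV."""
    spi = ev / pv                      # schedule performance index
    ed = actual_time * spi             # earned duration
    return actual_time + (planned_duration - ed) / pf

# Invented example: at day 30 of a 90-day plan, EV = 250 and PV = 300
# work units, so the project is behind schedule (SPI < 1).
print(eac_t_earned_duration(pv=300.0, ev=250.0, actual_time=30.0,
                            planned_duration=90.0))
```

Indexes like this, recomputed at each daily update, are the kind of input data the paper feeds into its prediction step.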
Abstract: Data mining incorporates a group of statistical methods used to
analyze a set of information, or a data set. It operates with models and
algorithms, which are powerful tools with great potential. They can help
people understand the patterns in a certain chunk of information, so it is
obvious that data mining tools have a wide area of application. For
example, in theoretical chemistry, data mining tools can be used to predict
molecule properties or improve computer-assisted drug design.
Classification analysis is one of the major data mining methodologies. The
aim of the contribution is to create a classification model able to deal
with a huge data set with high accuracy. For this purpose, logistic
regression, Bayesian logistic regression, and random forest models were
built using R software. A Bayesian logistic regression model in Latent GOLD
software was created as well. These classification methods belong to the
supervised learning methods.
It was necessary to reduce the data matrix dimension before constructing
the models, and thus factor analysis (FA) was used. The models were applied
to predict the biological activity of molecules, potential new drug
candidates.
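The pipeline this abstract describes (factor analysis for dimension reduction, then supervised classifiers) was built in R and Latent GOLD; as a hypothetical sketch of the same idea, here is a Python version on synthetic data, standing in for the authors' molecular descriptors and tools:

```python
# Sketch of the abstract's pipeline on synthetic data: factor analysis to
# reduce dimensionality, then logistic regression and random forest
# classifiers (Bayesian logistic regression is omitted here).
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
latent = rng.normal(size=(600, 3))                        # hidden factors
loadings = rng.normal(size=(3, 30))
X = latent @ loadings + 0.5 * rng.normal(size=(600, 30))  # observed descriptors
y = (latent[:, 0] > 0).astype(int)                        # stand-in for activity

# Reduce the 30-column data matrix to 3 factors before modeling.
X_red = FactorAnalysis(n_components=3, random_state=0).fit_transform(X)
X_tr, X_te, y_tr, y_te = train_test_split(X_red, y, random_state=0)

scores = {}
for model in (LogisticRegression(), RandomForestClassifier(random_state=0)):
    scores[type(model).__name__] = model.fit(X_tr, y_tr).score(X_te, y_te)
print(scores)  # held-out accuracy per classifier
```

Because the synthetic labels depend on a latent factor, both classifiers recover the signal well after the FA step; real molecular data would of course behave less cleanly.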