Abstract: Effective treatment of ground instability is essential when managing the impacts associated with historic mining. A field trial was undertaken by the Coal Authority to investigate the geotechnical performance and potential use of composite materials comprising resin and fill or stone to safely treat surface collapses, such as crown-holes, associated with shallow mining. Test pits were loosely filled with various granular fill materials. The fill material was injected with commercially available silicate and polyurethane resin foam products. In situ and laboratory testing was undertaken to assess the geotechnical properties of the resultant composite materials. The test pits were subsequently excavated to assess resin permeation. Drilling and resin injection were easiest through clean limestone fill materials. Recycled building waste fill material proved difficult to inject with resin; this material is thus considered unsuitable for use in resin composites. Incomplete resin permeation in several of the test pits created irregular ‘blocks’ of composite. Injected resin foams significantly improve the stiffness and resistance (strength) of the un-compacted fill material. The stiffness of the treated fill material appears to be a function of the stone particle size, its associated compaction characteristics (under loose tipping) and the proportion of resin foam matrix. The type of fill material is more critical than the type of resin to the geotechnical properties of the composite materials. Resin composites can effectively support typical imposed design loads. Compared to other traditional treatment options, such as cement grouting, the use of resin composites is potentially less disruptive, particularly for sites with limited access, and thus likely to achieve significant reinstatement cost savings. The use of resin composites is considered a suitable option for the future treatment of shallow mining collapses.
Abstract: Web usage mining is the application of data mining techniques to discover usage patterns from web log data, so as to capture the required patterns and serve the requirements of web-based applications. A user's experience on the Internet may be improved by minimizing web access latency, which can be achieved by predicting the page a user will request next so that it can be prefetched and cached. To enhance the quality of web services, it is therefore necessary to study user web navigation behavior, which is analyzed by modeling the web navigation history. We propose a technique that clusters user sessions based on the K-medoids algorithm.
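As a rough illustration of the underlying idea (our own sketch, not the authors' implementation), K-medoids clustering over sessions represented as feature vectors can be written as:

```python
import random

def k_medoids(points, k, dist, iters=20, seed=0):
    """Naive K-medoids sketch: alternate assignment and medoid update."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)
    clusters = {}
    for _ in range(iters):
        # Assign every point to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m_idx: dist(p, points[m_idx]))
            clusters[nearest].append(i)
        # Replace each medoid by the cluster member with minimal total distance.
        new_medoids = [
            min(members, key=lambda c: sum(dist(points[c], points[j]) for j in members))
            for members in clusters.values()
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return medoids, clusters
```

For user sessions, `points` could be binary page-visit vectors and `dist` a Jaccard or Hamming distance; unlike k-means, each cluster centre is always an actual session, which keeps the clusters interpretable.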
Abstract: This paper analyzes the ranking of the University of Malaysia Terengganu (UMT) website on the World Wide Web. Only a few studies have compared the rankings of university websites, so this research determines whether the existing UMT website is serving its purpose, which is to introduce UMT to the world. The ranking is based on hub and authority values derived from the structure of the website. These values are computed using two web-searching algorithms, HITS and SALSA. The websites of three other universities, UM, Harvard and Stanford, are used as benchmarks. The results clearly show that more work has to be done on the existing UMT website, since important pages found in the benchmarks do not exist among UMT's pages. The ranking of the UMT website will act as a guideline for web developers to build a more efficient website.
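For orientation, the hub/authority computation that HITS performs can be sketched as follows (a minimal textbook version; the paper's actual crawl, and the SALSA variant, differ in how scores are propagated):

```python
def hits(graph, iters=50):
    """Minimal HITS sketch: graph maps each page to the pages it links to."""
    pages = set(graph) | {q for targets in graph.values() for q in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iters):
        # A page's authority is the summed hub score of the pages linking to it.
        auth = {p: sum(hub[q] for q in pages if p in graph.get(q, ())) for p in pages}
        # A page's hub score is the summed authority of the pages it links to.
        hub = {p: sum(auth[q] for q in graph.get(p, ())) for p in pages}
        # Normalise so the scores stay bounded.
        for scores in (auth, hub):
            norm = sum(v * v for v in scores.values()) ** 0.5 or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth
```

Pages pointed to by many good hubs emerge with high authority, which is the quantity used here to judge whether UMT's important pages are as well connected as those of the benchmark universities.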
Abstract: Due to increasing efforts to protect the natural environment, a change in the structure of energy resources can be observed: renewable energy sources account for a growing fraction. In many countries traditional underground coal mining is losing its significance, but there are still countries, such as Poland and Germany, in which coal-based technologies account for the greatest fraction of total energy production. This makes it necessary to limit the costs and negative effects of underground coal mining. The longwall complex is an essential part of underground coal mining, and the safety and effectiveness of the work strongly depend on the diagnostic state of the powered roof supports. Building a useful and reliable diagnostic system requires a lot of data. Since acquiring data for every possible operating condition is impractical, it is important to be able to generate the required artificial working characteristics. This paper presents a new approach to modelling the leg pressure in a single unit of a powered roof support. The model is the result of an analysis of typical working cycles.
Abstract: The exponential growth of social media has drawn much attention to public opinion information. Online forums, blogs and micro-blogs are proving to be extremely valuable resources holding vast volumes of information. However, most social media data is in unstructured or semi-structured form, which makes it difficult to decipher automatically. It is therefore essential to understand and analyze these data in order to make the right decisions. Hotspot detection in online forums is a promising research field in web mining, as it helps users make the right decision at the right time. The proposed system uses a novel approach to detect hotspot forums for any given time period: aging theory is used to find hot terms, and E-K-means is used to detect the hotspot forums. Experimental results demonstrate that the proposed approach outperforms k-means in detecting hotspot forums, with improved accuracy.
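Aging theory treats each term like an organism whose "energy" rises when the term is mentioned and decays over time, so a hot term is one whose energy stays high. A toy sketch of that idea (illustrative constants and update rule of our own, not the paper's formulation):

```python
def term_energy(mention_counts, gain=0.5, decay=0.8):
    """Toy aging-theory score: energy grows with mentions and decays each time step."""
    energy = 0.0
    for count in mention_counts:
        energy = energy * decay + gain * count
    return energy
```

A term mentioned steadily accumulates more energy than one that spiked once and faded, so ranking terms by final energy surfaces the currently "hot" vocabulary on which the forum clustering can then operate.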
Abstract: Despite the highly touted benefits, emerging
technologies have unleashed pervasive concerns regarding unintended
and unforeseen social impacts. Thus, those wishing to create safe and
socially acceptable products need to identify such side effects and
mitigate them prior to the market proliferation. Various methodologies
in the field of technology assessment (TA), namely Delphi, impact
assessment, and scenario planning, have been widely incorporated in
such a circumstance. However, the literature faces a major limitation: its sole reliance on participatory workshop activities. It has overlooked a massive untapped source of futuristic information flooding through the Internet. This research thus seeks to gain insights into the utilization of futuristic data,
future-oriented documents from the Internet, as a supplementary
method to generate social impact scenarios whilst capturing
perspectives of experts from a wide variety of disciplines. To this end,
network analysis is conducted based on the social keywords extracted
from the futuristic documents by text mining, which is then used as a
guide to produce a comprehensive set of detailed scenarios. Our
proposed approach facilitates harmonized depictions of possible
hazardous consequences of emerging technologies and thereby makes
decision makers more aware of, and responsive to, broad qualitative
uncertainties.
Abstract: With the rapid growth of the Internet, web opinion sources dynamically emerge that are useful to both potential customers and product manufacturers for prediction and decision-making purposes. These are user-generated contents written in natural language, in an unstructured free-text form. Opinion mining techniques have therefore become popular for automatically processing customer reviews to extract product features and the user opinions expressed about them. Since customer reviews may contain both opinionated and factual sentences, a supervised machine learning technique is applied for subjectivity classification to improve mining performance. In this paper we address the task of opinion summarization. Product feature and opinion extraction is critical to opinion summarization, because its effectiveness significantly affects the identification of semantic relationships. The polarity and numeric score of each feature are determined using the SentiWordNet lexicon. The problem of opinion summarization concerns how to relate the opinion words to a given feature. A probabilistic supervised learning model improves the results and is more flexible and effective.
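As a schematic example of the lexicon lookup step (the mini-lexicon below is invented for illustration; SentiWordNet itself stores per-synset positive and negative scores):

```python
# Invented mini-lexicon for illustration: word -> (positive score, negative score).
LEXICON = {
    'excellent': (0.9, 0.0),
    'good':      (0.7, 0.1),
    'poor':      (0.1, 0.7),
    'terrible':  (0.0, 0.9),
}

def feature_polarity(opinion_words):
    """Aggregate (pos - neg) over the opinion words attached to one product feature."""
    score = sum(pos - neg
                for w in opinion_words
                for pos, neg in [LEXICON.get(w, (0.0, 0.0))])
    if score > 0:
        return 'positive'
    return 'negative' if score < 0 else 'neutral'
```

The harder part, which the paper addresses with its probabilistic model, is deciding which opinion words belong to which feature before this aggregation runs.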
Abstract: One image is worth more than a thousand words. Images, if analyzed, can reveal useful information. Low-level image processing deals with the extraction of a specific feature from a single image. The question then arises: what technique should be used to extract patterns from a very large and detailed image database? The answer is image mining. Image mining deals with the extraction of image data relationships, implicit knowledge, and other patterns from a collection of images or an image database; it is an extension of data mining. In this paper we not only scrutinize current image mining techniques but also present a new technique for mining images using a genetic algorithm.
Abstract: Various tools and technologies have been implemented for Crisis Response and Management (CRM), generally relying on the available network infrastructure for information exchange. Depending on the type of disaster or crisis, the network infrastructure may be affected and unable to provide reliable connectivity, in which case any tool or technology that depends on that connectivity cannot fulfill its functions. As a solution, a new message exchange framework has been developed. The framework provides an offline/online information exchange platform for CRM Information Systems (CRMIS); it uses XML compression and packet prioritization algorithms and is based on open-source web technologies. By introducing offline capabilities to web technologies, the framework is able to perform message exchange over unreliable networks. Experiments in a simulation environment show promising results on low-bandwidth networks (56 kbps and 28.8 kbps) with up to 50% packet loss: the solution successfully transferred all the information over these low-quality networks where traditional 2-tier and 3-tier applications failed.
Abstract: The use of the eXtensible Markup Language (XML) in web, business and scientific databases has led to the development of methods, techniques and systems to manage and analyze XML data. Semi-structured documents are difficult to handle due to their heterogeneity and dimensionality. XML structure and content mining represents a convergence of research in semi-structured data and text mining. As the information available on the Internet grows drastically, extracting knowledge from XML documents becomes a harder task. Indeed, documents are often so large that the data set returned as the answer to a query may itself be too big to convey the required information. To improve query answering, a Semantic Tree Based Association Rule (STAR) mining method is proposed. This method provides intensional information by considering the structure, the content and the semantics of the content. The method is applied to the Reuters dataset, and the results show that the proposed method performs well.
Abstract: Existing data mining methods cannot be applied directly to spatial data, because spatial data require the consideration of spatial-specific properties, such as spatial relationships. This paper focuses on classification with decision trees, one of the main data mining techniques. We propose an extension of the C4.5 algorithm to spatial data, based on two different approaches: join materialization and querying the different tables on the fly. Similar work has been done on these two main approaches; the first, join materialization, favors processing time at the expense of memory space, whereas the second, querying the different tables on the fly, saves memory space at the expense of processing time. The modified C4.5 algorithm requires three input tables: a target table, a neighbor table, and a spatial join index that contains the possible spatial relationships between the objects in the target table and those in the neighbor table. The proposed algorithms are applied to a spatial dataset in the accidentology domain. A comparative study of our approach against other work on classification by spatial decision trees is presented.
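The core of the C4.5 extension is its standard split criterion: each candidate attribute, including the spatial relationships read from the spatial join index, is scored by gain ratio. A compact sketch of that criterion (plain C4.5, not the paper's full spatial algorithm):

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def gain_ratio(rows, labels, attr):
    """C4.5 split criterion: information gain divided by split information."""
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[attr], []).append(y)
    n = len(labels)
    gain = entropy(labels) - sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy([row[attr] for row in rows])
    return gain / split_info if split_info else 0.0
```

In the spatial setting, `row[attr]` for a spatial attribute would hold the precomputed relationship (e.g. "adjacent", "contains") between a target object and its neighbors, taken either from the materialized join or from an on-the-fly query.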
Abstract: Nowadays, the Web has become one of the most pervasive platforms for information exchange and retrieval, collecting from websites the suitable, well-fitting information that one requires. Data mining is the process of extracting knowledge from data available on the Internet, and web mining is the branch of data mining that relates to various research communities, such as information retrieval, database management systems and artificial intelligence. In this paper we discuss the concepts of web mining, focusing mainly on one of its categories, web content mining, and its various tasks. Mining tools are essential for scanning the many images, texts and HTML documents, and their results are used by various search engines. We conclude by presenting a comparative table of these tools based on some pertinent criteria.
Abstract: Growth-regulating factor (GRF) genes encode a novel class of plant-specific transcription factors. GRF proteins play a role in the regulation of cell numbers in young and growing tissues and may act as transcription activators in plant growth and development. The identification of GRF genes and their expression is important for understanding the growth and development of various plant organs. In this study, to better understand the structural and functional differences within the GRF family, 45 GRF protein sequences from A. thaliana, Z. mays, O. sativa, B. napus, B. rapa, H. vulgare and S. bicolor were collected and analyzed through bioinformatics data mining. In the secondary structure of the GRFs, the number of alpha helices exceeded the number of beta sheets, and in all of them the QLQ domains lay entirely within the largest alpha helix. In all GRFs except AtGRF9, the QLQ and WRC domains were completely conserved. These proteins have no transmembrane domain; because they carry nuclear localization signals they act in the nucleus, and they are classified as unstable proteins in vitro.
Abstract: Frequent pattern mining is the process of finding patterns (sets of items, subsequences, substructures, etc.) that occur frequently in a data set. It was proposed in the context of frequent itemsets and association rule mining, and it is used to find inherent regularities in data, such as which products are often purchased together. Its applications include basket data analysis, cross-marketing, catalog design, sales campaign analysis, web log (click-stream) analysis, and DNA sequence analysis. However, one bottleneck of frequent itemset mining is that, as the data grow, the amount of time and resources required to mine them increases at an exponential rate. In this investigation a new algorithm is proposed that can be used as a pre-processor for frequent itemset mining. FASTER (FeAture SelecTion using Entropy and Rough sets) is a hybrid pre-processing algorithm that utilizes entropy and rough sets to carry out record reduction and feature (attribute) selection, respectively. For frequent itemset mining, FASTER can produce a speed-up of 3.1 times compared to the original algorithm while maintaining an accuracy of 71%.
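For context, the classic level-wise frequent itemset miner that such a pre-processor feeds looks roughly like this (plain Apriori, shown only to fix terminology; FASTER itself is the entropy/rough-set reduction step that runs before it):

```python
def apriori(transactions, min_support):
    """Plain Apriori sketch: level-wise search for frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions containing every item of the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    items = sorted({i for t in transactions for i in t})
    current = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
    frequent, k = {}, 1
    while current:
        frequent.update((s, support(s)) for s in current)
        # Join step: combine frequent k-itemsets into (k+1)-candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k + 1}
        current = [c for c in candidates if support(c) >= min_support]
        k += 1
    return frequent
```

The exponential blow-up the abstract mentions is visible in the candidate set; shrinking records and features beforehand, as FASTER does, directly shrinks this search space.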
Abstract: Much effort and research have been devoted to developing efficient tools for performing data mining tasks. Due to the massive amount of information embedded in the huge data warehouses maintained in many domains, extracting meaningful patterns manually is no longer feasible, which makes the development of data mining tools all the more necessary. Furthermore, the major aim of data mining software is to build resourceful predictive or descriptive models for handling large amounts of information more efficiently and in a user-friendly way. Data mining mainly deals with very large collections of data, which impose rigorous computational constraints; these challenges have led to the emergence of powerful data mining technologies. In this survey a diverse collection of data mining tools is described, and the salient features and performance behavior of each tool are contrasted.
Abstract: This paper investigates a new data mining capability that entails mining High Utility Itemsets (HUIs) in a distributed environment. Existing research in data mining deals only with the presence or absence of items and does not consider semantic measures such as the weight or cost of the items; HUI mining algorithms have evolved to address this. HUI mining is a form of utility mining that aims to identify itemsets whose utility satisfies a given threshold. However, mining HUIs in a distributed environment, and mining them from XML data, have not been explored yet. In this work, a novel approach is proposed to mine HUIs from XML-based data in a distributed environment. The work utilizes the Service Oriented Computing (SOC) paradigm, which provides Knowledge as a Service (KaaS): the interesting patterns are delivered via web services, with the help of a knowledge server, to answer consumers' queries. The performance of the approach is evaluated on various databases in terms of execution time and memory consumption.
Abstract: The continuous growth in the size of the World Wide Web has resulted in intricate Web sites, demanding enhanced user skills and more sophisticated tools to help the Web user find the desired information. In order to make the Web more user friendly, it is necessary to provide personalized services and recommendations to the Web user. Many Web usage mining techniques have been applied to discover interesting and frequent navigation patterns from Web server logs. The recommendation accuracy of usage-based techniques can be improved by integrating Web site content and site structure into the personalization process.
Herein, we propose a semantically enriched Web Usage Mining method for Personalization (SWUMP), an extension of the solely usage-based technique that combines the fields of Web Usage Mining and the Semantic Web. In the proposed method, we enrich the undirected graph derived from usage data with rich semantic information extracted from the Web pages and the Web site structure. The experimental results show that SWUMP generates accurate recommendations and achieves 10-20% better accuracy than the solely usage-based model. SWUMP also addresses the new-item problem inherent in solely usage-based techniques.
Abstract: The high utilization rate of Automated Teller Machines (ATMs) has inevitably led to long waiting times in queues, which in turn has increased the number of out-of-stock situations. Analyzing ATM utilization helps to determine the usage level and establish the necessity of an ATM based on its utilization within the ATM system. The times at which an ATM is used most frequently (peak times) are identified, and based on the predicted results the necessary actions can be taken by the bank management. The analysis is performed using data mining concepts, with the major part based on predictive data mining: results are predicted from historical (past) data in order to identify the relevant solution required. The Weka tool is used for the analysis of the data based on predictive data mining.
Abstract: Since big data has become substantially more accessible and manageable due to the development of powerful tools for dealing with unstructured data, people are eager to mine information from social media resources that could not be handled in the past. Sentiment analysis, as a novel branch of text mining, has in the last decade become increasingly important in marketing analysis, customer risk prediction and other fields. Scientists and researchers have undertaken significant work in creating and improving their sentiment models. In this paper, we present a concept of selecting appropriate classifiers based on the features and qualities of data sources by comparing the performances of five classifiers with three popular social media data sources: Twitter, Amazon Customer Reviews, and Movie Reviews. We introduced a couple of innovative models that outperform traditional sentiment classifiers for these data sources, and provide insights on how to further improve the predictive power of sentiment analysis. The modeling and testing work was done in R and Greenplum in-database analytic tools.
Abstract: The application of data mining to environmental monitoring has become crucial for a number of tasks related to emergency management. In recent years, many tools have been developed to support decision support systems (DSS) for emergency management. In this article a graphical user interface (GUI) for an environmental monitoring system is presented. The interface supports (i) data collection and observation and (ii) data extraction for mining. This tool may serve as the basis for future development along the lines of the open-source software paradigm.