Abstract: Big Data refers to recent technologies for manipulating voluminous, unstructured data sets drawn from multiple sources. NoSQL databases have emerged to handle the problem of unstructured data. Association rule mining is one of the most popular data mining techniques for extracting hidden relationships from transactional databases. The problem of finding association dependencies is well suited to MapReduce. The goal of our work is to reduce the time needed to generate frequent itemsets by using MapReduce and a document-oriented NoSQL database. A comparative study evaluates the performance of our algorithm against the classical Apriori algorithm.
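To make the idea concrete, the following is a minimal sketch of one MapReduce-style counting pass for frequent itemsets. It is not the authors' algorithm: it runs in memory in plain Python, does no Apriori candidate pruning, and does not touch a NoSQL store. The function names `map_phase`, `reduce_phase`, and `frequent_itemsets`, and the sample transactions, are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def map_phase(transaction, k):
    """Emit (itemset, 1) for every k-item subset of one transaction."""
    items = sorted(set(transaction))  # sort so each itemset has one canonical form
    return [(itemset, 1) for itemset in combinations(items, k)]


def reduce_phase(pairs, min_support):
    """Sum the counts per itemset and keep those that reach min_support."""
    counts = Counter()
    for itemset, count in pairs:
        counts[itemset] += count
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


def frequent_itemsets(transactions, k, min_support):
    """Run the map step over every transaction, then a single reduce step."""
    pairs = []
    for transaction in transactions:
        pairs.extend(map_phase(transaction, k))
    return reduce_phase(pairs, min_support)


# Hypothetical sample data, not taken from the paper.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
frequent_pairs = frequent_itemsets(transactions, k=2, min_support=2)
```

In a real deployment the map calls would run in parallel over partitions of the transaction database; the serial loop here only shows the data flow.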
Abstract: This paper explores efficient ways to implement various
media-updating features like news aggregation, video conversion,
and bulk email handling. All of these jobs share the property
that they are periodic in nature, and they all benefit from being
handled in a distributed fashion. The data for these jobs also often
comes from a social or collaborative source. We isolate the class of
periodic, one-round map reduce jobs as a useful setting to describe
and handle media updating tasks. As such tasks are simpler than
general map reduce jobs, programming them in a general map
reduce platform could easily become tedious. This paper presents
a MediaUpdater module of the Yioop Open Source Search Engine
Web Portal designed to handle such jobs via an extension of a
PHP class. We describe how to implement various media-updating
tasks in our system as well as experiments carried out using these
implementations on an Amazon Web Services cluster.
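As an illustration of what a periodic, one-round map reduce job might look like when expressed as a class extension, here is a minimal Python sketch. It is not Yioop's API, which the abstract describes as a PHP class: the base class `PeriodicOneRoundJob`, its hook names (`split`, `map`, `reduce`), and the `FeedItemCount` example are all hypothetical, and the map step runs serially rather than on a cluster.

```python
import time


class PeriodicOneRoundJob:
    """Hypothetical base class for a periodic, one-round map reduce job."""

    period_seconds = 3600  # how often the job should run

    def __init__(self):
        self.last_run = None  # timestamp of the last completed run

    def is_due(self, now):
        """A job is due if it has never run or its period has elapsed."""
        return self.last_run is None or now - self.last_run >= self.period_seconds

    def split(self):
        """Return the list of independent tasks for the map step."""
        raise NotImplementedError

    def map(self, task):
        """Process one task; in a cluster this would run on a worker."""
        raise NotImplementedError

    def reduce(self, results):
        """Combine all map results in a single round."""
        raise NotImplementedError

    def run_if_due(self, now=None):
        """Run one map round and one reduce round if the period has elapsed."""
        now = time.time() if now is None else now
        if not self.is_due(now):
            return None
        results = [self.map(task) for task in self.split()]
        self.last_run = now
        return self.reduce(results)


class FeedItemCount(PeriodicOneRoundJob):
    """Toy job: count the items in each feed every ten minutes."""

    period_seconds = 600

    def __init__(self, feeds):
        super().__init__()
        self.feeds = feeds

    def split(self):
        return list(self.feeds.items())

    def map(self, task):
        name, items = task
        return name, len(items)

    def reduce(self, results):
        return dict(results)


job = FeedItemCount({"tech": ["a", "b"], "sports": ["c"]})
first = job.run_if_due(now=0)     # runs: the job has never run before
second = job.run_if_due(now=300)  # skipped: only 300 of 600 seconds elapsed
third = job.run_if_due(now=600)   # runs again: the period has elapsed
```

The point of the design is that a job author only fills in three hooks, while scheduling and the single map and reduce round stay in the base class.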
Abstract: The system for analyzing and eliciting public
grievances serves to receive and process all kinds of
complaints from the public and to respond to users. Because of the large
number of complaints, the data becomes big data, which is difficult to store
and process. The proposed system uses HDFS to store the big data
and MapReduce to process it. Caching
is applied in the system to provide immediate responses and timely
action using big data analytics. Cache-enabled big data improves the
response time of the system. The unstructured data provided by
users is handled efficiently by a MapReduce algorithm.
Complaints are processed in the order given by the hierarchy of
authorities. The drawbacks of the traditional database system used
in the existing system are overcome by our system through a cache-enabled
Hadoop Distributed File System. MapReduce framework
code has the potential to leak sensitive data during
computation. We propose a system that adds noise to the
output of the reduce phase to avoid signaling the presence of
sensitive data. If a complaint is not processed within the allotted time,
it is automatically forwarded to the higher authority, which
ensures that every complaint is processed. A copy of the filed complaint is sent
as a digitally signed PDF document to the user's email address, which serves
as proof of filing. The system's reports provide essential data for
making important decisions based on legislation.
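The idea of adding noise to the reduce output can be sketched as follows. The abstract does not say which noise distribution is used, so the Laplace noise here is an assumption, as are the names `laplace_noise` and `noisy_reduce`, the `scale` parameter, and the sample complaint categories. The sketch runs in plain Python rather than on Hadoop.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    # expovariate takes a rate, so rate = 1 / scale gives mean = scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_reduce(mapped_pairs, scale, rng):
    """Sum values per key, then perturb each total before releasing it."""
    counts = Counter()
    for key, value in mapped_pairs:
        counts[key] += value
    return {key: total + laplace_noise(scale, rng) for key, total in counts.items()}


# Hypothetical map output: one (category, 1) pair per complaint.
rng = random.Random(42)
mapped = [("water", 1), ("roads", 1), ("water", 1)]
noisy_counts = noisy_reduce(mapped, scale=1.0, rng=rng)
```

A larger `scale` hides the contribution of any single record more strongly, at the cost of less accurate totals.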