Abstract: Big data applications have become imperative in many fields, and many researchers have devoted themselves to increasing classification accuracy and reducing time complexity. This study therefore designs and proposes an ontology-based backpropagation neural network classification and reasoning strategy for NoSQL big data applications, called ON4NoSQL. ON4NoSQL enhances classification performance in NoSQL and SQL databases in order to build mass behavior models. These models are built with MapReduce techniques and the Hadoop Distributed File System (HDFS) on a Hadoop service platform. The reference engine of ON4NoSQL is the ontology-based backpropagation neural network classification and reasoning strategy. Simulation results indicate that ON4NoSQL can efficiently construct a high-performance environment for storing, searching, and retrieving data.
Abstract: The main purpose of the system for analyzing and eliciting
public grievances is to receive and process all sorts of complaints
from the public and to respond to users. Because of the large number of
complaints, the data becomes big data, which is difficult to store
and process. The proposed system uses HDFS to store the big data
and uses MapReduce to process the big data. The concept of cache
was applied in the system to provide immediate response and timely
action using big data analytics. Cache-enabled big data processing
improves the response time of the system. The unstructured data provided
by the users is handled efficiently by the MapReduce algorithm. The
processing of complaints takes place in the order of the hierarchy of
the authority. Our system overcomes the drawbacks of the traditional
database system used in the existing system by using a cache-enabled
Hadoop Distributed File System. MapReduce framework code can
potentially leak sensitive data during computation. We propose
a system that adds noise to the
output of the reduce phase to avoid signaling the presence of
sensitive data. If a complaint is not processed within the allotted
time, it is automatically forwarded to the higher authority, which
provides assurance that it will be processed. A copy of the filed
complaint is sent as a digitally signed PDF document to the user's
email address, which serves as proof. The system report serves as
essential data when making important decisions based on legislation.
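The idea of adding noise to the output of the reduce phase can be illustrated with a small sketch. The abstract does not say which noise distribution or record layout the system uses, so the Laplace noise, the `scale` parameter, the `category` field, and the function names below are all assumptions made for illustration only.

```python
import random
from collections import defaultdict


def map_phase(records):
    """Emit a (category, 1) pair for each complaint record."""
    for record in records:
        yield record["category"], 1


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials.

    Assumption: Laplace noise is used here only as a common choice;
    the abstract does not name a distribution.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_reduce(pairs, scale=1.0, rng=None):
    """Sum the counts per key, then perturb each total before releasing it."""
    rng = rng or random.Random()
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return {key: total + laplace_noise(scale, rng) for key, total in totals.items()}


# Hypothetical complaint records; only the category is used here.
complaints = [
    {"category": "water"},
    {"category": "roads"},
    {"category": "water"},
]
noisy_counts = noisy_reduce(map_phase(complaints), scale=1.0, rng=random.Random(0))
```

A larger `scale` hides the contribution of any single record more strongly but makes the released counts less accurate, so in practice the value would have to be chosen against the accuracy the reports need.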
Abstract: The current Hadoop block placement policy does not distribute replicas of written blocks fairly and evenly across the datanodes in a Hadoop cluster.
This paper presents a new solution that helps to keep the cluster in a balanced state while an HDFS client is writing data to a file in a Hadoop cluster. The solution has been implemented, and tests have been conducted to evaluate its contribution to the Hadoop Distributed File System.
The solution was found to lower the global execution time taken by the Hadoop balancer to 22 percent. It was also found that the Hadoop balancer over-replicates 1.75 and 3.3 percent of all redistributed blocks in the modified and original Hadoop clusters, respectively.
The feature that keeps the cluster in a balanced state works as a core part of the Hadoop system rather than as a separate utility like the traditional balancer. This is one of the significant achievements and distinguishing features of the solution developed in the course of this research.
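To make the notion of balancing at write time concrete, the following is a minimal sketch of one possible placement rule: send each new block's replicas to the least-utilized datanodes. This is not the paper's algorithm, which the abstract does not describe, and it ignores rack awareness and other factors a real HDFS policy considers. The names `choose_targets` and `write_block` and the utilization figures are hypothetical.

```python
def choose_targets(utilization, replicas=3):
    """Return the `replicas` datanodes with the lowest utilization.

    Ties are broken by node name so the result is deterministic.
    """
    if replicas > len(utilization):
        raise ValueError("not enough datanodes for the requested replication")
    ranked = sorted(utilization, key=lambda node: (utilization[node], node))
    return ranked[:replicas]


def write_block(utilization, block_size, capacity, replicas=3):
    """Place one block and update each chosen datanode's utilization.

    Assumption: every datanode has the same capacity, given in the
    same unit as block_size.
    """
    targets = choose_targets(utilization, replicas)
    for node in targets:
        utilization[node] += block_size / capacity
    return targets


# Hypothetical cluster: fraction of disk space used on each datanode.
cluster = {"dn1": 0.70, "dn2": 0.20, "dn3": 0.40, "dn4": 0.10}
placed = write_block(cluster, block_size=128, capacity=12800, replicas=3)
```

Because the choice is made on every write, heavily used nodes such as `dn1` stop receiving new replicas until the others catch up, which is the kind of behavior that would leave less work for a separate balancer run.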
Abstract: This paper presents various techniques related to large-scale systems. First, large-scale systems are explained and their differences from traditional systems are described. Next, possible hardware and software specifications and requirements are listed. Finally, examples of large-scale systems are presented.