A Survey on Data-Centric and Data-Aware Techniques for Large Scale Infrastructures

Large scale computing infrastructures have been widely developed with the core objective of providing a suitable platform for high-performance and high-throughput computing. These systems are designed to support resource-intensive and complex applications, which can be found in many scientific and industrial areas. Currently, large scale data-intensive applications are hindered by the high latencies that result from accessing vastly distributed data. Recent works have suggested that improving data locality is key to moving towards exascale infrastructures efficiently, as solutions to this problem aim to reduce the bandwidth consumed in data transfers and the overheads that arise from them. Several techniques attempt to move computations closer to the data. In this survey we analyse the different mechanisms that have been proposed to provide data locality for large scale high-performance and high-throughput systems. This survey intends to assist the scientific computing community in understanding the various technical aspects and strategies that have been reported in recent literature regarding data locality. As a result, we present an overview of locality-oriented techniques, which are grouped in four main categories: application development, task scheduling, in-memory computing and storage platforms. Finally, we include a discussion on future research lines and synergies among these techniques.

A Keyword-Based Filtering Technique of Document-Centric XML using NFA Representation

XML is becoming a de facto standard for online data exchange. Existing XML filtering techniques based on a publish/subscribe model focus on highly structured data marked up with XML tags. These techniques filter data-centric XML documents efficiently but are not effective at filtering the element contents of document-centric XML. In this paper, we propose an extended XPath specification that includes the special matching character '%', as used in the LIKE operation of SQL, to address the difficulty of writing queries that adequately filter element contents with the previous XPath specification. We also present a novel technique for filtering a collection of document-centric XML documents, called Pfilter, which is able to exploit the extended XPath specification. We report several performance studies of its efficiency and scalability, measured by multi-query processing time (MQPT).
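The idea of LIKE-style matching on element contents can be illustrated with a minimal sketch. It is not the Pfilter technique or its NFA representation, which the abstract does not detail; it only assumes that '%' matches any run of characters, as in SQL LIKE. The function names `like_to_regex` and `filter_elements` are hypothetical and chosen for illustration.

```python
import re
import xml.etree.ElementTree as ET


def like_to_regex(pattern):
    """Translate a LIKE-style pattern into a compiled regex.

    Assumption: '%' is the only wildcard and matches zero or more
    characters; every other character is matched literally.
    """
    literal_parts = [re.escape(part) for part in pattern.split("%")]
    return re.compile(".*".join(literal_parts), re.DOTALL)


def filter_elements(xml_text, tag, pattern):
    """Return the text of every <tag> element whose content matches pattern."""
    root = ET.fromstring(xml_text)
    regex = like_to_regex(pattern)
    return [
        element.text
        for element in root.iter(tag)
        if element.text is not None and regex.fullmatch(element.text)
    ]


if __name__ == "__main__":
    document = (
        "<doc>"
        "<title>XML filtering with NFA</title>"
        "<title>Sensor routing</title>"
        "</doc>"
    )
    # Keep only titles whose content contains the word "filtering".
    print(filter_elements(document, "title", "%filtering%"))
```

A query such as `%filtering%` selects elements by a substring of their content, which is the kind of condition that is hard to express when filtering looks only at tag structure.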

Cooperative Energy Efficient Routing for Wireless Sensor Networks in Smart Grid Communications

Smart grids employ wireless sensor networks for control and monitoring. Sensors are characterized by limited processing power, energy supply and memory space, which requires particular attention in the design of routing and data management algorithms. Since most routing algorithms for sensor networks focus on finding energy-efficient paths to prolong the lifetime of sensor networks, the power of sensors on those paths depletes quickly, and consequently sensor networks become incapable of monitoring events in some parts of their target areas. The design of routing protocols should therefore consider not only energy-efficient paths, but also energy-efficient algorithms in general. In this paper we propose an energy-efficient routing protocol for wireless sensor networks that works without the support of any location information system. The reliability and efficiency of this protocol have been demonstrated by simulation studies in which we compare it to legacy protocols. Our simulation results show that the proposed protocol scales well with network size and density.

Mining and Visual Management of XML-Based Image Collections

This article describes Uruk, the virtual museum of Iraq that we developed for visual exploration and retrieval of image collections. The system largely exploits the loosely structured hierarchy of XML documents, which provides a useful representation for storing semi-structured or unstructured data that does not easily fit into existing databases. The system offers users the capability to mine and manage XML-based image collections through a web-based Graphical User Interface (GUI). Typically, in an interactive session with the system, the user can browse a visual structural summary of the XML database in order to select interesting elements. Using this intermediate result, queries combining structure and textual references can be composed and presented to the system. After query evaluation, the full set of answers is presented in a visual and structured way.

Loop-free Local Path Repair Strategy for Directed Diffusion

This paper proposes an implementation of the directed diffusion paradigm that aids in studying the paradigm's operations, and evaluates its behavior according to this implementation. Directed diffusion is evaluated with respect to loss percentage, lifetime, end-to-end delay, and throughput. From these evaluations, some suggestions and modifications are proposed to improve the behavior of directed diffusion with respect to these metrics. The proposed modifications reflect the effect of local path repair by introducing a technique called Loop-free Local Path Repair (LLPR), which improves the behavior of directed diffusion, especially packet loss percentage, which improves by about 92.69%. LLPR also improves throughput and end-to-end delay by about 55.31% and 14.06% respectively, while the lifetime decreases by about 29.79%.

Simulations of Routing Protocols of Wireless Sensor Networks

Wireless sensor networks are widely used in electronics and now appear in many applications, including military, environmental and healthcare applications, home automation and traffic control. We study one area of wireless sensor networks: routing protocols. Routing protocols are needed to send data between sensor nodes and the base station. In this paper, we discuss two types of routing protocol: data-centric and hierarchical. We show the output of the protocols using the NS-2 simulator, and compare the simulation output of the two routing protocols using Nam. We use Xgraph to plot the throughput and delay of each protocol.
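How throughput and delay are typically derived from simulation output can be sketched as follows. The record layout (send time, receive time, packet size in bytes) is a simplifying assumption made for illustration; it is not the NS-2 trace format, and the function names are hypothetical.

```python
def average_delay(records):
    """Mean end-to-end delay in seconds over all delivered packets.

    Each record is a hypothetical (send_time, recv_time, size_bytes) tuple.
    """
    if not records:
        raise ValueError("no delivered packets")
    return sum(recv - send for send, recv, _ in records) / len(records)


def throughput_bps(records):
    """Delivered bits divided by the span from first send to last receive."""
    if not records:
        raise ValueError("no delivered packets")
    first_send = min(send for send, _, _ in records)
    last_recv = max(recv for _, recv, _ in records)
    duration = last_recv - first_send
    if duration <= 0:
        raise ValueError("non-positive measurement interval")
    total_bits = 8 * sum(size for _, _, size in records)
    return total_bits / duration


if __name__ == "__main__":
    # Two 1000-byte packets, each delivered 0.5 s after it was sent.
    delivered = [(0.0, 0.5, 1000), (1.0, 1.5, 1000)]
    print("average delay (s):", average_delay(delivered))
    print("throughput (bit/s):", throughput_bps(delivered))
```

Computing both metrics from the same set of delivered packets keeps the comparison between two protocols consistent, provided both runs use the same traffic pattern.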

A Review of Coverage and Routing for Wireless Sensor Networks

The special constraints of sensor networks impose a number of technical challenges for employing them. In this review, we study the issues and existing protocols in two areas: coverage and routing. We present two types of coverage problems: to determine the minimum number of sensor nodes that need to perform active sensing in order to monitor a certain area; and to decide the quality of service that can be provided by a given sensor network. While most routing protocols in sensor networks are data-centric, there are other types of routing protocols as well, such as hierarchical, location-based, and QoS-aware. We describe and compare several protocols in each group. We present several multipath routing protocols and single-path routing protocols with local repair, which are proposed for recovering from sensor node crashes. We also discuss some transport layer schemes for reliable data transmission over lossy wireless channels.
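The first coverage problem, choosing few active nodes that still monitor an area, can be sketched with a greedy set-cover heuristic. This is an illustrative approximation under stated assumptions, not one of the protocols reviewed: the area is represented by a finite list of target points, every sensor has the same circular sensing radius, and the greedy choice does not guarantee the true minimum. All names are hypothetical.

```python
import math


def covered_targets(sensor, radius, targets):
    """Indices of target points within sensing radius of one sensor."""
    sx, sy = sensor
    return {
        index
        for index, (tx, ty) in enumerate(targets)
        if math.hypot(tx - sx, ty - sy) <= radius
    }


def greedy_active_set(sensors, radius, targets):
    """Pick sensors one at a time, each covering the most uncovered targets.

    Returns (active sensor indices, indices of targets left uncovered).
    """
    cover = [covered_targets(sensor, radius, targets) for sensor in sensors]
    uncovered = set(range(len(targets)))
    active = []
    while uncovered:
        best = max(range(len(sensors)), key=lambda i: len(cover[i] & uncovered))
        gain = cover[best] & uncovered
        if not gain:
            break  # the remaining targets are out of range of every sensor
        active.append(best)
        uncovered -= gain
    return active, uncovered


if __name__ == "__main__":
    sensors = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    targets = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    active, missed = greedy_active_set(sensors, 1.0, targets)
    print("active sensors:", active, "uncovered targets:", missed)
```

In this small layout the middle sensor alone reaches all three targets, so the other two nodes can stay inactive and save energy.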