Digital Forensics Compute Cluster: A High Speed Distributed Computing Capability for Digital Forensics

We have developed a distributed computing capability, the Digital Forensics Compute Cluster (DFORC2), to speed up the ingestion and processing of digital evidence resident on computer hard drives. DFORC2 parallelizes evidence ingestion and file processing steps. It can be run on a standalone computer cluster or in the Amazon Web Services (AWS) cloud. When running in a virtualized computing environment, its cluster resources can be dynamically scaled up or down using Kubernetes. DFORC2 is an open source project that uses Autopsy, Apache Spark and Kafka, and other open source software packages. It extends the proven open source digital forensics capabilities of Autopsy to compute clusters and cloud architectures, so digital forensics tasks can be accomplished efficiently by a scalable array of cluster compute nodes. In this paper, we describe DFORC2 and compare it with a standalone version of Autopsy when both are used to process evidence from hard drives of different sizes.
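
To illustrate the kind of parallelized ingestion step DFORC2 distributes across worker nodes, the following minimal PySpark sketch hashes evidence files in parallel. It is our own illustrative example, not DFORC2 code; the file paths are hypothetical and a running Spark installation is assumed.

```python
# Minimal PySpark sketch of parallel evidence hashing (illustrative only,
# not DFORC2 code). Assumes PySpark is installed; paths are hypothetical.
import hashlib

from pyspark import SparkContext

def hash_file(path):
    """Compute the MD5 digest of one evidence file on a worker node."""
    h = hashlib.md5()
    try:
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
    except OSError:
        return path, None                  # missing or unreadable file
    return path, h.hexdigest()

if __name__ == "__main__":
    sc = SparkContext(appName="evidence-hashing-sketch")
    paths = ["/evidence/file1.bin", "/evidence/file2.bin"]  # hypothetical
    # Each partition is processed independently on a cluster node.
    results = sc.parallelize(paths, numSlices=2).map(hash_file).collect()
    for path, digest in results:
        print(path, digest)
    sc.stop()
```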

Business-Intelligence Mining of Large Decentralized Multimedia Datasets with a Distributed Multi-Agent System

The rapid generation of a high volume and broad variety of data from the application of new technologies poses challenges for the generation of business intelligence. Most organizations and business owners need to extract data from multiple sources and apply analytical methods for the purpose of developing their business. As a result, today's decentralized data-management environments rely on a distributed computing paradigm. While data are stored in highly distributed systems, implementing distributed data-mining techniques remains a challenge; the aim of these techniques is to gather knowledge from every domain and from all the datasets residing on distributed resources. Since agent technologies offer significant support for managing the complexity of distributed systems, we consider them for next-generation data-mining processes. To demonstrate agent-based business-intelligence operations, we use agent-oriented modeling techniques to develop a new artifact for mining massive datasets.
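
As a minimal sketch of the agent-based idea, assuming nothing about the paper's actual artifact: each agent summarizes its local data shard, and only the summaries travel to a coordinator, so knowledge is gathered without centralizing the raw data.

```python
# Sketch of agent-style distributed mining: each agent summarizes its own
# data shard locally and only the summaries reach the coordinator.
# Illustrative only; the paper's agent-oriented models are more elaborate.

class MinerAgent:
    def __init__(self, name, records):
        self.name = name
        self.records = records  # the agent's local dataset (list of floats)

    def local_summary(self):
        """Return sufficient statistics (count, sum, sum of squares)."""
        n = len(self.records)
        s = sum(self.records)
        ss = sum(x * x for x in self.records)
        return n, s, ss

def global_mean_and_variance(agents):
    """Coordinator: merge local summaries into global statistics."""
    n = s = ss = 0
    for agent in agents:
        an, asum, ass = agent.local_summary()
        n, s, ss = n + an, s + asum, ss + ass
    mean = s / n
    return mean, ss / n - mean * mean

agents = [MinerAgent("sales", [10.0, 12.0]), MinerAgent("web", [8.0, 11.0, 9.0])]
print(global_mean_and_variance(agents))  # (10.0, 2.0)
```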

Performance Evaluation of Task Scheduling Algorithm on LCQ Network

Scheduling and mapping tasks onto a set of processors is considered a critical problem in parallel and distributed computing systems. This paper deals with the problem of dynamic scheduling on a special type of multiprocessor architecture known as the Linear Crossed Cube (LCQ) network. The LCQ is a hybrid network that combines the features of both linear and cube-based architectures. Two standard dynamic scheduling schemes, namely Minimum Distance Scheduling (MDS) and Two Round Scheduling (TRS), are implemented on the LCQ network. Parallel tasks are mapped and the load imbalance is evaluated on different sets of processors in the LCQ network. The simulation results are analyzed thoroughly to obtain the best solution for the given network in terms of residual load imbalance and execution time. Other performance metrics, such as speedup and efficiency, are also evaluated for the given dynamic algorithms.
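
The following toy sketch illustrates the minimum-distance idea behind MDS on a generic network: an overloaded node sheds tasks only to its directly connected (distance-one) neighbors. The LCQ topology and the paper's exact MDS/TRS rules are not reproduced here.

```python
# Generic sketch of minimum-distance-style dynamic load balancing: an
# overloaded node donates tasks only to its distance-1 neighbors. The LCQ
# topology and the paper's exact MDS/TRS rules differ in detail.

def mds_step(load, adj):
    """One balancing round: each node donates one task to its
    least-loaded neighbor if that neighbor is strictly lighter."""
    new_load = load[:]
    for node, neighbors in adj.items():
        lightest = min(neighbors, key=lambda v: load[v])
        if load[lightest] < load[node] - 1:
            new_load[node] -= 1
            new_load[lightest] += 1
    return new_load

# A small example network (a 4-node ring, not a real LCQ).
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
load = [8, 1, 1, 2]
for _ in range(5):
    load = mds_step(load, adj)
print(load)  # the load imbalance shrinks over successive rounds
```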

Accelerating Side Channel Analysis with Distributed and Parallelized Processing

Even when a cryptographic algorithm has no theoretical weakness, side-channel analysis can recover secret data from the physical implementation of a cryptosystem. The analysis is based on extra information, such as timing information, power consumption, electromagnetic leaks, or even sound, which can be exploited to break the system. Differential Power Analysis is one of the most popular such analyses; it computes statistical correlations between hypothesized secret keys and measured power consumption. It usually requires processing huge amounts of data and takes a long time, and may take several weeks for some devices with countermeasures. We propose and evaluate methods, based on distributed computing and parallelized processing, to shorten the time needed to analyze such cryptosystems.
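
A minimal sketch of how such an analysis parallelizes, assuming a synthetic one-sample-per-trace leakage model and a placeholder S-box (not a real attack): each candidate key byte is scored by correlation in a separate worker process.

```python
# Minimal sketch of parallelizing a correlation-based power analysis:
# candidate keys are scored in separate worker processes. The S-box,
# plaintexts, and traces below are synthetic placeholders, not a real attack.
from multiprocessing import Pool

import numpy as np

rng = np.random.default_rng(0)
SBOX = rng.permutation(256)                      # placeholder S-box
PLAINTEXTS = rng.integers(0, 256, 500)           # 500 random plaintext bytes
HW = np.array([bin(v).count("1") for v in range(256)])  # Hamming weights
TRUE_KEY = 0x3C
# Synthetic traces: leakage = HW(SBOX[pt ^ key]) plus Gaussian noise.
TRACES = HW[SBOX[PLAINTEXTS ^ TRUE_KEY]] + rng.normal(0.0, 1.0, 500)

def score_guess(k):
    """Correlation between hypothetical leakage under guess k and the traces."""
    hypothetical = HW[SBOX[PLAINTEXTS ^ k]]
    return abs(np.corrcoef(hypothetical, TRACES)[0, 1])

if __name__ == "__main__":
    with Pool() as pool:                 # distribute the 256 guesses
        scores = pool.map(score_guess, range(256))
    best = int(np.argmax(scores))
    print("recovered key byte: 0x%02X" % best)   # 0x3C with high probability
```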

Using Cloud Computing for E-Government: Challenges and Benefits

Cloud computing is a style of computing formed from the aggregation and evolution of technologies such as grid computing, distributed computing, parallel computing, and service-oriented architecture. Its aim is to provide computing, communication, and storage resources as services, securely and as quickly as possible, delivered virtually over the Internet. Since the services provided in e-government are available via the Internet, cloud computing can be used in the implementation of e-government architecture to provide better service at the lowest economic cost. In this paper, methods of using cloud computing in e-government are studied, the challenges and benefits of adopting the cloud for e-government are identified, and proposals are offered to overcome its shortcomings and to encourage governments and citizens to adopt this economical new technology.

An Efficient Run Time Interface for Heterogeneous Architecture of Large Scale Supercomputing System

In this paper we propose a novel Run Time Interface (RTI) technique to provide an efficient environment for MPI jobs on the heterogeneous architecture of PARAM Padma. It offers an innovative, unified framework for the job-management interface system in parallel and distributed computing. The approach employs a proxy scheme. The implementation shows that the proposed RTI is highly scalable and stable. Moreover, the RTI provides storage access for MPI jobs on various operating system platforms and improves data-access performance through the high-performance C-DAC Parallel File System (C-PFS). The performance of the RTI is evaluated using standard HPC benchmark suites, and the simulation results show that the proposed RTI performs well on a large-scale supercomputing system.
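
As a generic illustration of the kind of MPI job such a run-time interface manages (the actual RTI and C-PFS integration on PARAM Padma are not reproduced), here is a minimal mpi4py master/worker sketch.

```python
# Generic mpi4py master/worker sketch, only to illustrate the kind of MPI
# job a run-time interface manages; the actual RTI and C-PFS integration
# on PARAM Padma are not reproduced here.
# Run with: mpirun -np 4 python mpi_sketch.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

if rank == 0:
    # Rank 0 acts as a proxy/master: it scatters work and gathers results.
    chunks = [list(range(i, 100, size)) for i in range(size)]
else:
    chunks = None

work = comm.scatter(chunks, root=0)          # distribute task chunks
partial = sum(x * x for x in work)           # each rank computes locally
total = comm.reduce(partial, op=MPI.SUM, root=0)

if rank == 0:
    print("sum of squares 0..99 =", total)   # 328350
```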

A Discriminatory Rewarding Mechanism for Sybil Detection with Applications to Tor

This paper presents an economic game for sybil detection in a distributed computing environment. Cost parameters reflecting the impacts of different sybil attacks are introduced into the sybil detection game. Optimal strategies are devised for this game, in which both sybil and non-sybil identities are expected to participate. Based on this game, a cost-sharing economic mechanism called the Discriminatory Rewarding Mechanism for Sybil Detection is proposed. A detective accepts a security deposit from each active agent, negotiates with the agents, and offers rewards to the sybils if the latter disclose their identity. The basic objective of the detective is to determine the optimum reward amount for each sybil that will encourage the maximum possible number of sybils to reveal themselves. Maintaining privacy is an important issue for the mechanism, since the participants involved in the negotiation are generally reluctant to share their private information. The mechanism has been applied to Tor by introducing a reputation scoring function.
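
The following toy model, our own simplification rather than the paper's mechanism, captures the discriminatory flavor of the rewards: each sybil reveals itself only if its reward covers its private disclosure cost, and the detective maximizes revelations under a fixed budget.

```python
# Toy model of discriminatory rewarding (our illustration, not the paper's
# exact mechanism): a sybil reveals itself only if its reward covers its
# private disclosure cost; the detective maximizes the number of
# revelations under a fixed reward budget.

def discriminatory_rewards(costs, budget):
    """Pay each chosen sybil exactly its cost, cheapest first, so the
    budget buys the maximum possible number of revelations."""
    rewards = {}
    spent = 0.0
    for ident, cost in sorted(costs.items(), key=lambda kv: kv[1]):
        if spent + cost > budget:
            break
        rewards[ident] = cost          # discriminatory: per-sybil amount
        spent += cost
    return rewards

costs = {"s1": 2.0, "s2": 5.0, "s3": 1.0, "s4": 9.0}  # private costs
print(discriminatory_rewards(costs, budget=8.0))
# {'s3': 1.0, 's1': 2.0, 's2': 5.0}: three of four sybils revealed
```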

Service-Oriented Architecture for Object-Centric Information Fusion

In many applications there is a broad variety of information relevant to a focal "object" of interest, and the fusion of such heterogeneous data types is desirable for classification and categorization. While these various data types can sometimes be treated as orthogonal (such as the hull number, superstructure color, and speed of an oil tanker), there are instances where the inference and correlation between quantities can provide improved fusion capabilities (such as the height, weight, and gender of a person). A service-oriented architecture has been designed and prototyped to support the fusion of information for such "object-centric" situations. It is modular, scalable, and flexible, and designed to support new data sources, fusion algorithms, and computational resources without affecting existing services. The architecture is designed to simplify the incorporation of legacy systems, support exact and probabilistic entity disambiguation, recognize and utilize multiple types of uncertainties, and minimize network bandwidth requirements.
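
A minimal sketch of the "orthogonal" case, with made-up numbers: when attribute reports are independent given the object class, per-source likelihoods can simply be multiplied (naive-Bayes fusion); correlated attributes such as height and weight would instead require a joint model.

```python
# Sketch of the "orthogonal" fusion case: when per-source attribute reports
# are (approximately) independent given the object class, likelihoods can
# simply be multiplied. Correlated attributes need a joint model instead.
# All numbers below are made up for illustration.

def fuse_orthogonal(prior, likelihoods):
    """Combine independent per-source likelihoods with a prior (naive Bayes)."""
    posterior = {}
    for cls, p in prior.items():
        for source in likelihoods:
            p *= source[cls]
        posterior[cls] = p
    z = sum(posterior.values())
    return {cls: p / z for cls, p in posterior.items()}

prior = {"tanker": 0.3, "cargo": 0.7}
hull_number = {"tanker": 0.8, "cargo": 0.2}   # hypothetical source reports
color = {"tanker": 0.6, "cargo": 0.4}
print(fuse_orthogonal(prior, [hull_number, color]))
# {'tanker': 0.72, 'cargo': 0.28}: the posterior shifts toward "tanker"
```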

A Self-stabilizing Algorithm for Maximum Popular Matching of Strictly Ordered Preference Lists

In this paper, we consider the problem of Popular Matching for strictly ordered preference lists. A popular matching is not guaranteed to exist in every network. We propose an ID-based, constant-space, self-stabilizing algorithm that converges to a Maximum Popular Matching, an optimum solution, if one exists. We show that the algorithm stabilizes in O(n^5) moves under any scheduler (daemon).
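
The popularity vote that defines the problem can be sketched directly (the distributed, self-stabilizing algorithm itself is not reproduced here): matching M1 is more popular than M2 if more agents strictly prefer their partner in M1.

```python
# Sketch of the popularity vote between two matchings; the paper's
# self-stabilizing algorithm is distributed and is not shown here.

def vote(m1, m2, prefs):
    """Return (#agents preferring m1, #agents preferring m2).
    prefs[a] is a's strictly ordered list; being unmatched is worst."""
    up = down = 0
    for a, ranking in prefs.items():
        r1 = ranking.index(m1[a]) if m1.get(a) in ranking else len(ranking)
        r2 = ranking.index(m2[a]) if m2.get(a) in ranking else len(ranking)
        if r1 < r2:
            up += 1
        elif r2 < r1:
            down += 1
    return up, down

prefs = {"a": ["x", "y"], "b": ["x", "z"], "c": ["y", "x"]}
m1 = {"a": "x", "b": "z", "c": "y"}
m2 = {"a": "y", "b": "z", "c": "x"}
print(vote(m1, m2, prefs))  # (2, 0): m1 is more popular than m2
```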

Independent Spanning Trees on Systems-on-chip Hypercubes Routing

Independent spanning trees (ISTs) provide a number of advantages in data broadcasting; examples include their use in fault-tolerant network protocols for distributed computing and in improving bandwidth utilization. However, the problem of constructing multiple ISTs is considered hard for arbitrary graphs. In this paper we present an efficient algorithm to construct ISTs on hypercubes that requires minimal resources.
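
The defining property ISTs must satisfy can be checked as follows; this sketch verifies independence for trees given as parent maps and does not reproduce the paper's construction algorithm.

```python
# Sketch of the property ISTs must satisfy: for every node v, its paths to
# the root in the different trees share no intermediate vertex. This only
# *checks* independence; the paper's construction is not reproduced.

def root_path(parent, v, root):
    path = [v]
    while v != root:
        v = parent[v]
        path.append(v)
    return path

def independent(trees, root):
    """True if, for each node, the root-paths across all trees are
    pairwise internally vertex-disjoint."""
    nodes = set(trees[0]) | {root}
    for v in nodes - {root}:
        paths = [root_path(t, v, root) for t in trees]
        for i in range(len(paths)):
            for j in range(i + 1, len(paths)):
                if set(paths[i][1:-1]) & set(paths[j][1:-1]):
                    return False
    return True

# Two spanning trees of the 2-cube Q2 (nodes 0..3), rooted at 0.
t1 = {1: 0, 3: 1, 2: 0}   # parent map: 1->0, 3->1, 2->0
t2 = {2: 0, 3: 2, 1: 0}   # parent map: 2->0, 3->2, 1->0
print(independent([t1, t2], root=0))  # True: internal vertices differ
```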

An Image Monitoring System based on Multi-Format Streaming Grid Architecture

This paper proposes a novel multi-format streaming grid architecture for a real-time image monitoring system. The system, based on a three-tier architecture, includes a stream-receiving unit, a stream-processing unit, and a presentation unit. It is a distributed, loosely coupled architecture; the benefit is that the number of required servers can be adjusted according to the load on the image monitoring system. The stream-receiving unit supports multiple capture source devices and multi-format stream-compression encoders. The stream-processing unit includes three modules: a stream clipping module, an image processing module, and an image management module. The presentation unit can display image data on several different platforms. We verified the proposed grid architecture with an actual image monitoring test. We used a fast image-matching method with adjustable parameters for different monitoring situations. A background subtraction method is also implemented in the system. Experimental results showed that the proposed architecture is robust, adaptive, and effective for image monitoring.
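
A minimal sketch of the background-subtraction step, with synthetic frames and made-up parameter values: pixels that differ from a slowly adapting background model by more than a threshold are flagged as foreground.

```python
# Minimal background-subtraction sketch of the kind the monitoring system
# implements: pixels that deviate from a running background model by more
# than a threshold are flagged as foreground. Frames and parameter values
# are synthetic placeholders.
import numpy as np

def update(background, frame, alpha=0.05, threshold=25):
    """Return (foreground mask, updated background model)."""
    diff = np.abs(frame - background)
    mask = diff > threshold                                # moving pixels
    background = (1 - alpha) * background + alpha * frame  # slow adaptation
    return mask, background

background = np.full((4, 4), 100.0)   # learned background intensity
frame = background.copy()
frame[1:3, 1:3] = 200.0               # a synthetic "moving object"
mask, background = update(background, frame)
print(mask.astype(int))               # 1s mark the detected object region
```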

A Fast Replica Placement Methodology for Large-scale Distributed Computing Systems

Fine-grained data replication over the Internet allows duplication of frequently accessed data objects, as opposed to entire sites, at certain locations so as to improve the performance of large-scale content distribution systems. In a distributed system, agents representing their sites try to maximize their own benefit, since they are driven by different goals, such as minimizing their communication costs, latency, etc. In this paper, we use game-theoretical techniques, in particular auctions, to identify a bidding mechanism that encapsulates the selfishness of the agents while keeping a controlling hand over them. In essence, the proposed game-theory-based mechanism is the study of what happens when independent agents act selfishly and how to control them to maximize overall performance. A bidding mechanism asks how one can design systems so that agents' selfish behavior results in the desired system-wide goals. Experimental results reveal that this mechanism provides excellent solution quality while maintaining fast execution time. The comparisons are recorded against some well-known techniques such as greedy, branch and bound, game-theoretical auctions, and genetic algorithms.
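
For concreteness, here is a simplified sketch of the greedy baseline the mechanism is compared against (costs and capacities are made up): repeatedly place the single replica with the largest remaining cost saving until site capacities are exhausted.

```python
# Simplified sketch of the greedy comparison baseline: repeatedly place the
# replica with the best remaining cost saving until site capacities run
# out. Savings and capacities below are made up for illustration.

def greedy_placement(saving, capacity):
    """saving[(obj, site)] -> benefit of replicating obj at site."""
    placed = []
    while True:
        candidates = [(b, o, s) for (o, s), b in saving.items()
                      if capacity[s] > 0 and (o, s) not in placed]
        if not candidates:
            break
        b, o, s = max(candidates)      # best remaining saving
        if b <= 0:
            break
        placed.append((o, s))
        capacity[s] -= 1               # one replica consumes one slot
    return placed

saving = {("o1", "A"): 9, ("o1", "B"): 4, ("o2", "A"): 7, ("o2", "B"): 6}
capacity = {"A": 1, "B": 1}
print(greedy_placement(saving, capacity))  # [('o1', 'A'), ('o2', 'B')]
```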

Experimental Parallel Architecture for Rendering 3D Model into MPEG-4 Format

This paper presents the initial findings of research into distributed computer rendering. The goal of the research is to create a distributed computer system capable of rendering a 3D model into an MPEG-4 stream. This paper outlines the initial design, software architecture, and hardware setup for the system. Distributed computing means designing and implementing programs that run on two or more interconnected computing systems, and it is often used to speed up the rendering of graphical imagery; distributed computing systems are used to generate images for movies, games, and simulations. A topic of interest is the application of distributed computing to the MPEG-4 standard. During the course of the research, a distributed system will be created that can render a 3D model into an MPEG-4 stream. It is expected that applying distributed computing principles will speed up rendering, thus improving the usefulness and efficiency of the MPEG-4 standard.

Distributed 2-Vertex Connectivity Test of Graphs Using Local Knowledge

The vertex connectivity of a graph is the smallest number of vertices whose deletion separates the graph or makes it trivial. This work is devoted to the problem of testing the vertex connectivity of graphs in a distributed environment, based on a general and constructive approach. The contribution of this paper is threefold. First, using a pre-constructed spanning tree of the considered graph, we present a protocol to test whether a given graph is 2-connected using only local knowledge. Second, we present an encoding of this protocol using graph relabeling systems. The last contribution is the implementation of this protocol in the message-passing model. For a given graph G, where M is the number of its edges, N the number of its nodes, and Δ its degree, our algorithms have the following requirements: the first one uses O(Δ×N^2) steps and O(Δ×logΔ) bits per node; the second one uses O(Δ×N^2) messages, O(N^2) time, and O(Δ×logΔ) bits per node. Furthermore, the studied network is semi-anonymous: only the root of the pre-constructed spanning tree needs to be identified.
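
For reference, the classical sequential 2-vertex-connectivity test (DFS with lowpoints) is sketched below; the paper's contribution is the distributed, local-knowledge protocol, which this sketch does not reproduce.

```python
# Sequential reference check for 2-vertex connectivity (the classical
# DFS/lowpoint criterion); the paper's distributed, local-knowledge
# protocol is not reproduced here.

def is_biconnected(adj):
    """True iff the graph (adjacency dict) has >= 3 nodes, is connected,
    and contains no articulation vertex."""
    disc, low = {}, {}
    counter = [0]

    def dfs(u, parent):
        disc[u] = low[u] = counter[0]
        counter[0] += 1
        children = 0
        for v in adj[u]:
            if v not in disc:
                children += 1
                if not dfs(v, u):
                    return False
                low[u] = min(low[u], low[v])
                # v's subtree cannot climb above u: u is a cut vertex.
                if parent is not None and low[v] >= disc[u]:
                    return False
            elif v != parent:
                low[u] = min(low[u], disc[v])
        # The DFS root is a cut vertex iff it has more than one child.
        return parent is not None or children <= 1

    nodes = list(adj)
    return len(nodes) >= 3 and dfs(nodes[0], None) and len(disc) == len(nodes)

cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}  # 2-connected
path = {0: [1], 1: [0, 2], 2: [1]}                    # vertex 1 is a cut vertex
print(is_biconnected(cycle), is_biconnected(path))    # True False
```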

A Multi-Layer Consistency Protocol for Replica Management in Large-Scale Systems

Large-scale systems such as computational Grids are distributed computing infrastructures that can provide globally available network resources. The evolution of information-processing systems in Data Grids is characterized by strong decentralization of data across several sites, with the objective of ensuring the availability and reliability of the data so as to provide fault tolerance and scalability, which is possible only through replication techniques. Unfortunately, the use of these techniques has a high cost, because consistency must be maintained between the distributed replicas. Nevertheless, agreeing to live with certain imperfections can improve the performance of the system by increasing concurrency. In this paper, we propose a multi-layer protocol combining the pessimistic and optimistic approaches, conceived for data-consistency maintenance in large-scale systems. Our approach is based on a hierarchical, tree-structured representation model with three layers, and its objective is twofold: first, it makes it possible to reduce response times compared with a completely pessimistic approach; second, it improves quality of service compared with an optimistic approach.

Replicating Data Objects in Large-scale Distributed Computing Systems using Extended Vickrey Auction

This paper proposes a novel game-theoretical technique to address the problem of data object replication in large-scale distributed computing systems. The proposed technique draws inspiration from computational economic theory and employs the extended Vickrey auction. Specifically, players in a non-cooperative environment compete for scarce server-side memory space in which to replicate data objects, so as to minimize the total network object-transfer cost while maintaining object concurrency. Optimization of this cost in turn leads to load balancing, fault tolerance, and reduced user access time. The method is experimentally evaluated against four well-known techniques from the literature: branch and bound, greedy, bin packing, and genetic algorithms. The experimental results reveal that the proposed approach outperforms the four techniques in both execution time and solution quality.
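
The plain Vickrey rule underlying the paper's extended mechanism can be sketched in a few lines (the extension itself is not reproduced): the highest bidder wins the scarce replica slot but pays the second-highest bid, which makes truthful bidding optimal.

```python
# Sketch of the plain Vickrey (second-price, sealed-bid) rule underlying
# the paper's extended mechanism: the highest bidder wins but pays the
# second-highest bid. Bids below are made up for illustration.

def vickrey(bids):
    """bids: {agent: bid}. Return (winner, price paid)."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner, _ = ranked[0]
    price = ranked[1][1] if len(ranked) > 1 else 0.0
    return winner, price

# Servers bid their valuation of hosting a replica of an object.
bids = {"server_A": 12.0, "server_B": 9.5, "server_C": 7.0}
print(vickrey(bids))  # ('server_A', 9.5): A wins but pays B's bid
```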

PeliGRIFF: A Parallel DEM-DLM/FD Method for DNS of Particulate Flows with Collisions

An original Direct Numerical Simulation (DNS) method to tackle the problem of particulate flows at moderate to high concentration and finite Reynolds number is presented. Our method is built on the framework established by Glowinski and his coworkers [1], in the sense that we use their Distributed Lagrange Multiplier/Fictitious Domain (DLM/FD) formulation and their operator-splitting idea, but differs in the treatment of particle collisions. The novelty of our contribution lies in replacing the simple artificial-repulsive-force collision model usually employed in the literature with an efficient Discrete Element Method (DEM) granular solver. The DEM solver enables us to consider particles of arbitrary (at least convex) shape and to account for actual contacts, in the sense that particles actually touch each other, in contrast with the simple repulsive-force collision model. We recently upgraded our serial code, GRIFF [2], to full MPI capabilities. Our new code, PeliGRIFF, is developed within the framework of the fully MPI open source platform PELICANS [3]. The new MPI capabilities of PeliGRIFF open new perspectives in the study of particulate flows and significantly increase the number of particles that can be considered in a full DNS approach: O(100000) in 2D and O(10000) in 3D. Results on the 2D/3D sedimentation/fluidization of isometric polygonal/polyhedral particles with collisions are presented.
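
As a flavor of what a DEM granular solver computes at each contact, here is a linear spring-dashpot normal-force sketch, a common DEM ingredient with illustrative parameter values; the actual solver handles arbitrary convex shapes, tangential friction, and more, and its exact contact law is not reproduced here.

```python
# Sketch of a linear spring-dashpot normal contact force, a common DEM
# ingredient; PeliGRIFF's granular solver (arbitrary convex shapes,
# friction, ...) is far more general. Stiffness/damping values are
# illustrative only.
import numpy as np

def normal_contact_force(x1, x2, v1, v2, r1, r2, kn=1e4, cn=5.0):
    """Force on sphere 1 from contact with sphere 2 (zero if separated)."""
    d = x1 - x2
    dist = np.linalg.norm(d)
    overlap = (r1 + r2) - dist
    if overlap <= 0.0:
        return np.zeros(3)                 # no contact
    n = d / dist                           # unit normal, points toward 1
    vn = np.dot(v1 - v2, n)                # normal relative velocity
    return (kn * overlap - cn * vn) * n    # elastic repulsion + damping

x1, x2 = np.array([0.0, 0.0, 0.0]), np.array([0.9, 0.0, 0.0])
v1, v2 = np.array([0.0, 0.0, 0.0]), np.array([-1.0, 0.0, 0.0])
print(normal_contact_force(x1, x2, v1, v2, r1=0.5, r2=0.5))
# [-1005. 0. 0.]: sphere 1 is pushed away from the approaching sphere 2
```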

A Survey of Job Scheduling and Resource Management in Grid Computing

Grid computing is a form of distributed computing that involves coordinating and sharing computational power, data storage, and network resources across dynamic and geographically dispersed organizations. Scheduling onto the Grid is NP-complete, so there is no single best scheduling algorithm for all grid computing systems. An alternative is to select an appropriate scheduling algorithm for a given grid environment, based on the characteristics of the tasks, machines, and network connectivity. Job and resource scheduling is one of the key research areas in grid computing. The goal of scheduling is to achieve the highest possible system throughput and to match the application need with the available computing resources. The motivation of this survey is to encourage newcomers to the field of grid computing, so that they can easily understand the concept of scheduling and contribute to developing more efficient scheduling algorithms. This will also benefit interested researchers carrying out further work in this thrust area of research.

Distributed Case Based Reasoning for Intelligent Tutoring System: An Agent Based Student Modeling Paradigm

Online learning with an Intelligent Tutoring System (ITS) is becoming very popular: the system models the student's learning behavior and presents the learning material (content, questions and answers, assignments) to the student accordingly. In today's distributed computing environment, the tutoring system can take advantage of networking to reuse the model built for one student for students from other, similar groups. In the present paper we present a methodology in which, using Case Based Reasoning (CBR), the ITS provides student modeling for online learning in a distributed environment with the help of agents. The paper describes the approach, the architecture, and the agent characteristics for such a system. This concept can be deployed to develop an ITS where the tutor can author and the students can learn locally, while the ITS models the students' learning globally in a distributed environment. The advantage of such an approach is that both the learning material (domain knowledge) and the student model can be globally distributed, enhancing the efficiency of the ITS while reducing the bandwidth requirement and the complexity of the system.