A Real-Time Monitoring System of the Supply Chain Conditions, Products and Means of Transport

Real-time monitoring of the supply chain conditions and procedures is a critical element for the optimal coordination and safety of the deliveries, as well as for the minimization of the delivery time and cost. Real time monitoring requires IoT data streams, which are related to the conditions of the products and the means of transport (e.g., location, temperature/humidity conditions, kinematic state, ambient light conditions, etc.). These streams are generated by battery-based IoT tracking devices, equipped with appropriate sensors, and are transmitted to a cloud-based back-end system. Proper handling and processing of the IoT data streams, using predictive and artificial intelligence algorithms, can provide significant and useful results, which can be exploited by the supply chain stakeholders in order to enhance their financial benefits, as well as the efficiency, security, transparency, coordination and sustainability of the supply chain procedures. The technology, the features and the characteristics of a complete, proprietary system, including hardware, firmware and software tools - developed in the context of a co-funded R&D program - are addressed and presented in this paper. 

Decision-Making Strategies on Smart Dairy Farms: A Review

Farm management and operations will drastically change due to access to real-time data, real-time forecasting and tracking of physical items in combination with Internet of Things (IoT) developments to further automate farm operations. Dairy farms have embraced technological innovations and procured vast amounts of permanent data streams during the past decade; however, the integration of this information to improve the whole farm decision-making process does not exist. It is now imperative to develop a system that can collect, integrate, manage, and analyze on-farm and off-farm data in real-time for practical and relevant environmental and economic actions. The developed systems, based on machine learning and artificial intelligence, need to be connected for useful output, a better understanding of the whole farming issue and environmental impact. Evolutionary Computing (EC) can be very effective in finding the optimal combination of sets of some objects and finally, in strategy determination. The system of the future should be able to manage the dairy farm as well as an experienced dairy farm manager with a team of the best agricultural advisors. All these changes should bring resilience and sustainability to dairy farming as well as improving and maintaining good animal welfare and the quality of dairy products. This review aims to provide an insight into the state-of-the-art of big data applications and EC in relation to smart dairy farming and identify the most important research and development challenges to be addressed in the future. Smart dairy farming influences every area of management and its uptake has become a continuing trend.

Real-Time Data Stream Partitioning over a Sliding Window in Real-Time Spatial Big Data

In recent years, real-time spatial applications, like location-aware services and traffic monitoring, have become more and more important. Such applications result dynamic environments where data as well as queries are continuously moving. As a result, there is a tremendous amount of real-time spatial data generated every day. The growth of the data volume seems to outspeed the advance of our computing infrastructure. For instance, in real-time spatial Big Data, users expect to receive the results of each query within a short time period without holding in account the load of the system. But with a huge amount of real-time spatial data generated, the system performance degrades rapidly especially in overload situations. To solve this problem, we propose the use of data partitioning as an optimization technique. Traditional horizontal and vertical partitioning can increase the performance of the system and simplify data management. But they remain insufficient for real-time spatial Big data; they can’t deal with real-time and stream queries efficiently. Thus, in this paper, we propose a novel data partitioning approach for real-time spatial Big data named VPA-RTSBD (Vertical Partitioning Approach for Real-Time Spatial Big data). This contribution is an implementation of the Matching algorithm for traditional vertical partitioning. We find, firstly, the optimal attribute sequence by the use of Matching algorithm. Then, we propose a new cost model used for database partitioning, for keeping the data amount of each partition more balanced limit and for providing a parallel execution guarantees for the most frequent queries. VPA-RTSBD aims to obtain a real-time partitioning scheme and deals with stream data. It improves the performance of query execution by maximizing the degree of parallel execution. This affects QoS (Quality Of Service) improvement in real-time spatial Big Data especially with a huge volume of stream data. The performance of our contribution is evaluated via simulation experiments. The results show that the proposed algorithm is both efficient and scalable, and that it outperforms comparable algorithms.

Weighted-Distance Sliding Windows and Cooccurrence Graphs for Supporting Entity-Relationship Discovery in Unstructured Text

The problem of Entity relation discovery in structured data, a well covered topic in literature, consists in searching within unstructured sources (typically, text) in order to find connections among entities. These can be a whole dictionary, or a specific collection of named items. In many cases machine learning and/or text mining techniques are used for this goal. These approaches might be unfeasible in computationally challenging problems, such as processing massive data streams. A faster approach consists in collecting the cooccurrences of any two words (entities) in order to create a graph of relations - a cooccurrence graph. Indeed each cooccurrence highlights some grade of semantic correlation between the words because it is more common to have related words close each other than having them in the opposite sides of the text. Some authors have used sliding windows for such problem: they count all the occurrences within a sliding windows running over the whole text. In this paper we generalise such technique, coming up to a Weighted-Distance Sliding Window, where each occurrence of two named items within the window is accounted with a weight depending on the distance between items: a closer distance implies a stronger evidence of a relationship. We develop an experiment in order to support this intuition, by applying this technique to a data set consisting in the text of the Bible, split into verses.

Visual Text Analytics Technologies for Real-Time Big Data: Chronological Evolution and Issues

New approaches to analyze and visualize data stream in real-time basis is important in making a prompt decision by the decision maker. Financial market trading and surveillance, large-scale emergency response and crowd control are some example scenarios that require real-time analytic and data visualization. This situation has led to the development of techniques and tools that support humans in analyzing the source data. With the emergence of Big Data and social media, new techniques and tools are required in order to process the streaming data. Today, ranges of tools which implement some of these functionalities are available. In this paper, we present chronological evolution evaluation of technologies for supporting of real-time analytic and visualization of the data stream. Based on the past research papers published from 2002 to 2014, we gathered the general information, main techniques, challenges and open issues. The techniques for streaming text visualization are identified based on Text Visualization Browser in chronological order. This paper aims to review the evolution of streaming text visualization techniques and tools, as well as to discuss the problems and challenges for each of identified tools.

Real-Time Image Encryption Using a 3D Discrete Dual Chaotic Cipher

In this paper, an encryption algorithm is proposed for real-time image encryption. The scheme employs a dual chaotic generator based on a three dimensional (3D) discrete Lorenz attractor. Encryption is achieved using non-autonomous modulation where the data is injected into the dynamics of the master chaotic generator. The second generator is used to permute the dynamics of the master generator using the same approach. Since the data stream can be regarded as a random source, the resulting permutations of the generator dynamics greatly increase the security of the transmitted signal. In addition, a technique is proposed to mitigate the error propagation due to the finite precision arithmetic of digital hardware. In particular, truncation and rounding errors are eliminated by employing an integer representation of the data which can easily be implemented. The simple hardware architecture of the algorithm makes it suitable for secure real-time applications.

Mining Big Data in Telecommunications Industry: Challenges, Techniques, and Revenue Opportunity

Mining big data represents a big challenge nowadays. Many types of research are concerned with mining massive amounts of data and big data streams. Mining big data faces a lot of challenges including scalability, speed, heterogeneity, accuracy, provenance and privacy. In telecommunication industry, mining big data is like a mining for gold; it represents a big opportunity and maximizing the revenue streams in this industry. This paper discusses the characteristics of big data (volume, variety, velocity and veracity), data mining techniques and tools for handling very large data sets, mining big data in telecommunication and the benefits and opportunities gained from them.

On the Optimality Assessment of Nanoparticle Size Spectrometry and Its Association to the Entropy Concept

Particle size distribution, the most important characteristics of aerosols, is obtained through electrical characterization techniques. The dynamics of charged nanoparticles under the influence of electric field in Electrical Mobility Spectrometer (EMS) reveals the size distribution of these particles. The accuracy of this measurement is influenced by flow conditions, geometry, electric field and particle charging process, therefore by the transfer function (transfer matrix) of the instrument. In this work, a wire-cylinder corona charger was designed and the combined fielddiffusion charging process of injected poly-disperse aerosol particles was numerically simulated as a prerequisite for the study of a multichannel EMS. The result, a cloud of particles with no uniform charge distribution, was introduced to the EMS. The flow pattern and electric field in the EMS were simulated using Computational Fluid Dynamics (CFD) to obtain particle trajectories in the device and therefore to calculate the reported signal by each electrometer. According to the output signals (resulted from bombardment of particles and transferring their charges as currents), we proposed a modification to the size of detecting rings (which are connected to electrometers) in order to evaluate particle size distributions more accurately. Based on the capability of the system to transfer information contents about size distribution of the injected particles, we proposed a benchmark for the assessment of optimality of the design. This method applies the concept of Von Neumann entropy and borrows the definition of entropy from information theory (Shannon entropy) to measure optimality. Entropy, according to the Shannon entropy, is the ''average amount of information contained in an event, sample or character extracted from a data stream''. Evaluating the responses (signals) which were obtained via various configurations of detecting rings, the best configuration which gave the best predictions about the size distributions of injected particles, was the modified configuration. It was also the one that had the maximum amount of entropy. A reasonable consistency was also observed between the accuracy of the predictions and the entropy content of each configuration. In this method, entropy is extracted from the transfer matrix of the instrument for each configuration. Ultimately, various clouds of particles were introduced to the simulations and predicted size distributions were compared to the exact size distributions.

Survey on Arabic Sentiment Analysis in Twitter

Large-scale data stream analysis has become one of the important business and research priorities lately. Social networks like Twitter and other micro-blogging platforms hold an enormous amount of data that is large in volume, velocity and variety. Extracting valuable information and trends out of these data would aid in a better understanding and decision-making. Multiple analysis techniques are deployed for English content. Moreover, one of the languages that produce a large amount of data over social networks and is least analyzed is the Arabic language. The proposed paper is a survey on the research efforts to analyze the Arabic content in Twitter focusing on the tools and methods used to extract the sentiments for the Arabic content on Twitter.

Frequent and Systematic Timing Enhancement of Congestion Window in Typical Transmission Control Protocol

Transmission Control Protocol (TCP) among the wired and wireless networks, it still has a practical problem; where the congestion control mechanism does not permit the data stream to get complete bandwidth over the existing network links. To solve this problem, many TCP protocols have been introduced with high speed performance. Therefore, an enhanced congestion window (cwnd) for the congestion control mechanism is proposed in this article to improve the performance of TCP by increasing the number of cycles of the new window to improve the transmitted packet number. The proposed algorithm used a new mechanism based on the available bandwidth of the connection to detect the capacity of network path in order to improve the regular clocking of congestion avoidance mechanism. The work in this paper based on using Network Simulator 2 (NS-2) to simulate the proposed algorithm.

A Content-Based Optimization of Data Stream Television Multiplex

The television multiplex has reserved capacity and therefore we can use only limited number of videos for propagation of it. Appropriate composition of the multiplex has a major impact on how many videos is spread by multiplex. Therefore in this paper is designed a simple algorithm to optimize capacity utilization multiplex. Significant impact on the number of programs in the multiplex has also the fact from which programs is composed. Content of multiplex can be movies, news, sport, animated stories, documentaries, etc. These types have their own specific characteristics that affect their resulting data stream. In this paper is also done an impact analysis of the composition of the multiplex to use its capacity by video content. 

Semantic Support for Hypothesis-Based Research from Smart Environment Monitoring and Analysis Technologies

Improvements in the data fusion and data analysis phase of research are imperative due to the exponential growth of sensed data. Currently, there are developments in the Semantic Sensor Web community to explore efficient methods for reuse, correlation and integration of web-based data sets and live data streams. This paper describes the integration of remotely sensed data with web-available static data for use in observational hypothesis testing and the analysis phase of research. The Semantic Reef system combines semantic technologies (e.g., well-defined ontologies and logic systems) with scientific workflows to enable hypothesis-based research. A framework is presented for how the data fusion concepts from the Semantic Reef architecture map to the Smart Environment Monitoring and Analysis Technologies (SEMAT) intelligent sensor network initiative. The data collected via SEMAT and the inferred knowledge from the Semantic Reef system are ingested to the Tropical Data Hub for data discovery, reuse, curation and publication.

SUPAR: System for User-Centric Profiling of Association Rules in Streaming Data

With a surge of stream processing applications novel techniques are required for generation and analysis of association rules in streams. The traditional rule mining solutions cannot handle streams because they generally require multiple passes over the data and do not guarantee the results in a predictable, small time. Though researchers have been proposing algorithms for generation of rules from streams, there has not been much focus on their analysis. We propose Association rule profiling, a user centric process for analyzing association rules and attaching suitable profiles to them depending on their changing frequency behavior over a previous snapshot of time in a data stream. Association rule profiles provide insights into the changing nature of associations and can be used to characterize the associations. We discuss importance of characteristics such as predictability of linkages present in the data and propose metric to quantify it. We also show how association rule profiles can aid in generation of user specific, more understandable and actionable rules. The framework is implemented as SUPAR: System for Usercentric Profiling of Association Rules in streaming data. The proposed system offers following capabilities: i) Continuous monitoring of frequency of streaming item-sets and detection of significant changes therein for association rule profiling. ii) Computation of metrics for quantifying predictability of associations present in the data. iii) User-centric control of the characterization process: user can control the framework through a) constraint specification and b) non-interesting rule elimination.

Cumulative Learning based on Dynamic Clustering of Hierarchical Production Rules(HPRs)

An important structuring mechanism for knowledge bases is building clusters based on the content of their knowledge objects. The objects are clustered based on the principle of maximizing the intraclass similarity and minimizing the interclass similarity. Clustering can also facilitate taxonomy formation, that is, the organization of observations into a hierarchy of classes that group similar events together. Hierarchical representation allows us to easily manage the complexity of knowledge, to view the knowledge at different levels of details, and to focus our attention on the interesting aspects only. One of such efficient and easy to understand systems is Hierarchical Production rule (HPRs) system. A HPR, a standard production rule augmented with generality and specificity information, is of the following form Decision If < condition> Generality Specificity . HPRs systems are capable of handling taxonomical structures inherent in the knowledge about the real world. In this paper, a set of related HPRs is called a cluster and is represented by a HPR-tree. This paper discusses an algorithm based on cumulative learning scenario for dynamic structuring of clusters. The proposed scheme incrementally incorporates new knowledge into the set of clusters from the previous episodes and also maintains summary of clusters as Synopsis to be used in the future episodes. Examples are given to demonstrate the behaviour of the proposed scheme. The suggested incremental structuring of clusters would be useful in mining data streams.

Design of Buffer Management for Industry to Avoid Sensor Data- Conflicts

To reduce accidents in the industry, WSNs(Wireless Sensor networks)- sensor data is used. WSNs- sensor data has the persistence and continuity. therefore, we design and exploit the buffer management system that has the persistence and continuity to avoid and delivery data conflicts. To develop modules, we use the multi buffers and design the buffer management modules that transfer sensor data through the context-aware methods.

Unsupervised Outlier Detection in Streaming Data Using Weighted Clustering

Outlier detection in streaming data is very challenging because streaming data cannot be scanned multiple times and also new concepts may keep evolving. Irrelevant attributes can be termed as noisy attributes and such attributes further magnify the challenge of working with data streams. In this paper, we propose an unsupervised outlier detection scheme for streaming data. This scheme is based on clustering as clustering is an unsupervised data mining task and it does not require labeled data, both density based and partitioning clustering are combined for outlier detection. In this scheme partitioning clustering is also used to assign weights to attributes depending upon their respective relevance and weights are adaptive. Weighted attributes are helpful to reduce or remove the effect of noisy attributes. Keeping in view the challenges of streaming data, the proposed scheme is incremental and adaptive to concept evolution. Experimental results on synthetic and real world data sets show that our proposed approach outperforms other existing approach (CORM) in terms of outlier detection rate, false alarm rate, and increasing percentages of outliers.

ASC – A Stream Cipher with Built – In MAC Functionality

In this paper we present the design of a new encryption scheme. The scheme we propose is a very exible encryption and authentication primitive. We build this scheme on two relatively new design principles: t-functions and fast pseudo hadamard transforms. We recapitulate the theory behind these principles and analyze their security properties and efficiency. In more detail we propose a streamcipher which outputs a message authentication tag along with theencrypted data stream with only little overhead. Moreover we proposesecurity-speed tradeoffs. Our scheme is faster than other comparablet-function based designs while offering the same security level.

A Keyword-Based Filtering Technique of Document-Centric XML using NFA Representation

XML is becoming a de facto standard for online data exchange. Existing XML filtering techniques based on a publish/subscribe model are focused on the highly structured data marked up with XML tags. These techniques are efficient in filtering the documents of data-centric XML but are not effective in filtering the element contents of the document-centric XML. In this paper, we propose an extended XPath specification which includes a special matching character '%' used in the LIKE operation of SQL in order to solve the difficulty of writing some queries to adequately filter element contents using the previous XPath specification. We also present a novel technique for filtering a collection of document-centric XMLs, called Pfilter, which is able to exploit the extended XPath specification. We show several performance studies, efficiency and scalability using the multi-query processing time (MQPT).

Optimal Convolutive Filters for Real-Time Detection and Arrival Time Estimation of Transient Signals

Linear convolutive filters are fast in calculation and in application, and thus, often used for real-time processing of continuous data streams. In the case of transient signals, a filter has not only to detect the presence of a specific waveform, but to estimate its arrival time as well. In this study, a measure is presented which indicates the performance of detectors in achieving both of these tasks simultaneously. Furthermore, a new sub-class of linear filters within the class of filters which minimize the quadratic response is proposed. The proposed filters are more flexible than the existing ones, like the adaptive matched filter or the minimum power distortionless response beamformer, and prove to be superior with respect to that measure in certain settings. Simulations of a real-time scenario confirm the advantage of these filters as well as the usefulness of the performance measure.

Fault Localization and Alarm Correlation in Optical WDM Networks

For several high speed networks, providing resilience against failures is an essential requirement. The main feature for designing next generation optical networks is protecting and restoring high capacity WDM networks from the failures. Quick detection, identification and restoration make networks more strong and consistent even though the failures cannot be avoided. Hence, it is necessary to develop fast, efficient and dependable fault localization or detection mechanisms. In this paper we propose a new fault localization algorithm for WDM networks which can identify the location of a failure on a failed lightpath. Our algorithm detects the failed connection and then attempts to reroute data stream through an alternate path. In addition to this, we develop an algorithm to analyze the information of the alarms generated by the components of an optical network, in the presence of a fault. It uses the alarm correlation in order to reduce the list of suspected components shown to the network operators. By our simulation results, we show that our proposed algorithms achieve less blocking probability and delay while getting higher throughput.