Developing an Advanced Algorithm Capable of Classifying News, Articles and Other Textual Documents Using Text Mining Techniques

The reason for conducting this research is to develop an algorithm that is capable of classifying news articles from the automobile industry, according to the competitive actions that they entail, with the use of Text Mining (TM) methods. It is needed to test how to properly preprocess the data for this research by preparing pipelines which fits each algorithm the best. The pipelines are tested along with nine different classification algorithms in the realm of regression, support vector machines, and neural networks. Preliminary testing for identifying the optimal pipelines and algorithms resulted in the selection of two algorithms with two different pipelines. The two algorithms are Logistic Regression (LR) and Artificial Neural Network (ANN). These algorithms are optimized further, where several parameters of each algorithm are tested. The best result is achieved with the ANN. The final model yields an accuracy of 0.79, a precision of 0.80, a recall of 0.78, and an F1 score of 0.76. By removing three of the classes that created noise, the final algorithm is capable of reaching an accuracy of 94%.

Hybrid Structure Learning Approach for Assessing the Phosphate Laundries Impact

Bayesian Network (BN) is one of the most efficient classification methods. It is widely used in several fields (i.e., medical diagnostics, risk analysis, bioinformatics research). The BN is defined as a probabilistic graphical model that represents a formalism for reasoning under uncertainty. This classification method has a high-performance rate in the extraction of new knowledge from data. The construction of this model consists of two phases for structure learning and parameter learning. For solving this problem, the K2 algorithm is one of the representative data-driven algorithms, which is based on score and search approach. In addition, the integration of the expert's knowledge in the structure learning process allows the obtainment of the highest accuracy. In this paper, we propose a hybrid approach combining the improvement of the K2 algorithm called K2 algorithm for Parents and Children search (K2PC) and the expert-driven method for learning the structure of BN. The evaluation of the experimental results, using the well-known benchmarks, proves that our K2PC algorithm has better performance in terms of correct structure detection. The real application of our model shows its efficiency in the analysis of the phosphate laundry effluents' impact on the watershed in the Gafsa area (southwestern Tunisia).

Efficient HAAR Wavelet Transform with Embedded Zerotrees of Wavelet Compression for Color Images

This study is expected to compress true color image with compression algorithms in color spaces to provide high compression rates. The need of high compression ratio is to improve storage space. Alternative aim is to rank compression algorithms in a suitable color space. The dataset is sequence of true color images with size 128 x 128. HAAR Wavelet is one of the famous wavelet transforms, has great potential and maintains image quality of color images. HAAR wavelet Transform using Set Partitioning in Hierarchical Trees (SPIHT) algorithm with different color spaces framework is applied to compress sequence of images with angles. Embedded Zerotrees of Wavelet (EZW) is a powerful standard method to sequence data. Hence the proposed compression frame work of HAAR wavelet, xyz color space, morphological gradient and applied image with EZW compression, obtained improvement to other methods, in terms of Compression Ratio, Mean Square Error, Peak Signal Noise Ratio and Bits Per Pixel quality measures.

Fundamental Theory of the Evolution Force: Gene Engineering utilizing Synthetic Evolution Artificial Intelligence

The effects of the evolution force are observable in nature at all structural levels ranging from small molecular systems to conversely enormous biospheric systems. However, the evolution force and work associated with formation of biological structures has yet to be described mathematically or theoretically. In addressing the conundrum, we consider evolution from a unique perspective and in doing so we introduce the “Fundamental Theory of the Evolution Force: FTEF”. We utilized synthetic evolution artificial intelligence (SYN-AI) to identify genomic building blocks and to engineer 14-3-3 ζ docking proteins by transforming gene sequences into time-based DNA codes derived from protein hierarchical structural levels. The aforementioned served as templates for random DNA hybridizations and genetic assembly. The application of hierarchical DNA codes allowed us to fast forward evolution, while dampening the effect of point mutations. Natural selection was performed at each hierarchical structural level and mutations screened using Blosum 80 mutation frequency-based algorithms. Notably, SYN-AI engineered a set of three architecturally conserved docking proteins that retained motion and vibrational dynamics of native Bos taurus 14-3-3 ζ.

The Benefits of End-To-End Integrated Planning from the Mine to Client Supply for Minimizing Penalties

The control over delivered iron ore blend characteristics is one of the most important aspects of the mining business. The iron ore price is a function of its composition, which is the outcome of the beneficiation process. So, end-to-end integrated planning of mine operations can reduce risks of penalties on the iron ore price. In a standard iron mining company, the production chain is composed of mining, ore beneficiation, and client supply. When mine planning and client supply decisions are made uncoordinated, the beneficiation plant struggles to deliver the best blend possible. Technological improvements in several fields allowed bridging the gap between departments and boosting integrated decision-making processes. Clusterization and classification algorithms over historical production data generate reasonable previsions for quality and volume of iron ore produced for each pile of run-of-mine (ROM) processed. Mathematical modeling can use those deterministic relations to propose iron ore blends that better-fit specifications within a delivery schedule. Additionally, a model capable of representing the whole production chain can clearly compare the overall impact of different decisions in the process. This study shows how flexibilization combined with a planning optimization model between the mine and the ore beneficiation processes can reduce risks of out of specification deliveries. The model capabilities are illustrated on a hypothetical iron ore mine with magnetic separation process. Finally, this study shows ways of cost reduction or profit increase by optimizing process indicators across the production chain and integrating the different plannings with the sales decisions.

Probabilistic Approach of Dealing with Uncertainties in Distributed Constraint Optimization Problems and Situation Awareness for Multi-agent Systems

In this paper, we describe how Bayesian inferential reasoning will contributes in obtaining a well-satisfied prediction for Distributed Constraint Optimization Problems (DCOPs) with uncertainties. We also demonstrate how DCOPs could be merged to multi-agent knowledge understand and prediction (i.e. Situation Awareness). The DCOPs functions were merged with Bayesian Belief Network (BBN) in the form of situation, awareness, and utility nodes. We describe how the uncertainties can be represented to the BBN and make an effective prediction using the expectation-maximization algorithm or conjugate gradient descent algorithm. The idea of variable prediction using Bayesian inference may reduce the number of variables in agents’ sampling domain and also allow missing variables estimations. Experiment results proved that the BBN perform compelling predictions with samples containing uncertainties than the perfect samples. That is, Bayesian inference can help in handling uncertainties and dynamism of DCOPs, which is the current issue in the DCOPs community. We show how Bayesian inference could be formalized with Distributed Situation Awareness (DSA) using uncertain and missing agents’ data. The whole framework was tested on multi-UAV mission for forest fire searching. Future work focuses on augmenting existing architecture to deal with dynamic DCOPs algorithms and multi-agent information merging.

Secure Socket Layer in the Network and Web Security

In order to electronically exchange information between network users in the web of data, different software such as outlook is presented. So, the traffic of users on a site or even the floors of a building can be decreased as a result of applying a secure and reliable data sharing software. It is essential to provide a fast, secure and reliable network system in the data sharing webs to create an advanced communication systems in the users of network. In the present research work, different encoding methods and algorithms in data sharing systems is studied in order to increase security of data sharing systems by preventing the access of hackers to the transferred data. To increase security in the networks, the possibility of textual conversation between customers of a local network is studied. Application of the encryption and decryption algorithms is studied in order to increase security in networks by preventing hackers from infiltrating. As a result, a reliable and secure communication system between members of a network can be provided by preventing additional traffic in the website environment in order to increase speed, accuracy and security in the network and web systems of data sharing.

Constructing a Bayesian Network for Solar Energy in Egypt Using Life Cycle Analysis and Machine Learning Algorithms

In an era where machines run and shape our world, the need for a stable, non-ending source of energy emerges. In this study, the focus was on the solar energy in Egypt as a renewable source, the most important factors that could affect the solar energy’s market share throughout its life cycle production were analyzed and filtered, the relationships between them were derived before structuring a Bayesian network. Also, forecasted models were built for multiple factors to predict the states in Egypt by 2035, based on historical data and patterns, to be used as the nodes’ states in the network. 37 factors were found to might have an impact on the use of solar energy and then were deducted to 12 factors that were chosen to be the most effective to the solar energy’s life cycle in Egypt, based on surveying experts and data analysis, some of the factors were found to be recurring in multiple stages. The presented Bayesian network could be used later for scenario and decision analysis of using solar energy in Egypt, as a stable renewable source for generating any type of energy needed.

Optimal and Critical Path Analysis of State Transportation Network Using Neo4J

A transportation network is a realization of a spatial network, describing a structure which permits either vehicular movement or flow of some commodity. Examples include road networks, railways, air routes, pipelines, and many more. The transportation network plays a vital role in maintaining the vigor of the nation’s economy. Hence, ensuring the network stays resilient all the time, especially in the face of challenges such as heavy traffic loads and large scale natural disasters, is of utmost importance. In this paper, we used the Neo4j application to develop the graph. Neo4j is the world's leading open-source, NoSQL, a native graph database that implements an ACID-compliant transactional backend to applications. The Southern California network model is developed using the Neo4j application and obtained the most critical and optimal nodes and paths in the network using centrality algorithms. The edge betweenness centrality algorithm calculates the critical or optimal paths using Yen's k-shortest paths algorithm, and the node betweenness centrality algorithm calculates the amount of influence a node has over the network. The preliminary study results confirm that the Neo4j application can be a suitable tool to study the important nodes and the critical paths for the major congested metropolitan area.

Inferential Reasoning for Heterogeneous Multi-Agent Mission

We describe issues bedeviling the coordination of heterogeneous (different sensors carrying agents) multi-agent missions such as belief conflict, situation reasoning, etc. We applied Bayesian and agents' presumptions inferential reasoning to solve the outlined issues with the heterogeneous multi-agent belief variation and situational-base reasoning. Bayesian Belief Network (BBN) was used in modeling the agents' belief conflict due to sensor variations. Simulation experiments were designed, and cases from agents’ missions were used in training the BBN using gradient descent and expectation-maximization algorithms. The output network is a well-trained BBN for making inferences for both agents and human experts. We claim that the Bayesian learning algorithm prediction capacity improves by the number of training data and argue that it enhances multi-agents robustness and solve agents’ sensor conflicts.

Implementation of Channel Estimation and Timing Synchronization Algorithms for MIMO-OFDM System Using NI USRP 2920

MIMO-OFDM communication system presents a key solution for the next generation of mobile communication due to its high spectral efficiency, high data rate and robustness against multi-path fading channels. However, MIMO-OFDM system requires a perfect knowledge of the channel state information and a good synchronization between the transmitter and the receiver to achieve the expected performances. Recently, we have proposed two algorithms for channel estimation and timing synchronization with good performances and very low implementation complexity compared to those proposed in the literature. In order to validate and evaluate the efficiency of these algorithms in real environments, this paper presents in detail the implementation of 2 × 2 MIMO-OFDM system based on LabVIEW and USRP 2920. Implementation results show a good agreement with the simulation results under different configuration parameters.

NANCY: Combining Adversarial Networks with Cycle-Consistency for Robust Multi-Modal Image Registration

Multimodal image registration is a profoundly complex task which is why deep learning has been used widely to address it in recent years. However, two main challenges remain: Firstly, the lack of ground truth data calls for an unsupervised learning approach, which leads to the second challenge of defining a feasible loss function that can compare two images of different modalities to judge their level of alignment. To avoid this issue altogether we implement a generative adversarial network consisting of two registration networks GAB, GBA and two discrimination networks DA, DB connected by spatial transformation layers. GAB learns to generate a deformation field which registers an image of the modality B to an image of the modality A. To do that, it uses the feedback of the discriminator DB which is learning to judge the quality of alignment of the registered image B. GBA and DA learn a mapping from modality A to modality B. Additionally, a cycle-consistency loss is implemented. For this, both registration networks are employed twice, therefore resulting in images ˆA, ˆB which were registered to ˜B, ˜A which were registered to the initial image pair A, B. Thus the resulting and initial images of the same modality can be easily compared. A dataset of liver CT and MRI was used to evaluate the quality of our approach and to compare it against learning and non-learning based registration algorithms. Our approach leads to dice scores of up to 0.80 ± 0.01 and is therefore comparable to and slightly more successful than algorithms like SimpleElastix and VoxelMorph.

Authentication of Physical Objects with Dot-Based 2D Code

Counterfeit goods and documents are a global problem, which needs more and more sophisticated methods of resolving it. Existing techniques using watermarking or embedding symbols on objects are not suitable for all use cases. To address those special needs, we created complete system allowing authentication of paper documents and physical objects with flat surface. Objects are marked using orientation independent and resistant to camera noise 2D graphic codes, named DotAuth. Based on the identifier stored in 2D code, the system is able to perform basic authentication and allows to conduct more sophisticated analysis methods, e.g., relying on augmented reality and physical properties of the object. In this paper, we present the complete architecture, algorithms and applications of the proposed system. Results of the features comparison of the proposed solution and other products are presented as well, pointing to the existence of many advantages that increase usability and efficiency in the means of protecting physical objects.

Review of Strategies for Hybrid Energy Storage Management System in Electric Vehicle Application

Electric Vehicles (EV) appear to be gaining increasing patronage as a feasible alternative to Internal Combustion Engine Vehicles (ICEVs) for having low emission and high operation efficiency. The EV energy storage systems are required to handle high energy and power density capacity constrained by limited space, operating temperature, weight and cost. The choice of strategies for energy storage evaluation, monitoring and control remains a challenging task. This paper presents review of various energy storage technologies and recent researches in battery evaluation techniques used in EV applications. It also underscores strategies for the hybrid energy storage management and control schemes for the improvement of EV stability and reliability. The study reveals that despite the advances recorded in battery technologies there is still no cell which possess both the optimum power and energy densities among other requirements, for EV application. However combination of two or more energy storages as hybrid and allowing the advantageous attributes from each device to be utilized is a promising solution. The review also reveals that State-of-Charge (SoC) is the most crucial method for battery estimation. The conventional method of SoC measurement is however questioned in the literature and adaptive algorithms that include all model of disturbances are being proposed. The review further suggests that heuristic-based approach is commonly adopted in the development of strategies for hybrid energy storage system management. The alternative approach which is optimization-based is found to be more accurate but is memory and computational intensive and as such not recommended in most real-time applications.

Constant Factor Approximation Algorithm for p-Median Network Design Problem with Multiple Cable Types

This research presents the first constant approximation algorithm to the p-median network design problem with multiple cable types. This problem was addressed with a single cable type and there is a bifactor approximation algorithm for the problem. To the best of our knowledge, the algorithm proposed in this paper is the first constant approximation algorithm for the p-median network design with multiple cable types. The addressed problem is a combination of two well studied problems which are p-median problem and network design problem. The introduced algorithm is a random sampling approximation algorithm of constant factor which is conceived by using some random sampling techniques form the literature. It is based on a redistribution Lemma from the literature and a steiner tree problem as a subproblem. This algorithm is simple, and it relies on the notions of random sampling and probability. The proposed approach gives an approximation solution with one constant ratio without violating any of the constraints, in contrast to the one proposed in the literature. This paper provides a (21 + 2)-approximation algorithm for the p-median network design problem with multiple cable types using random sampling techniques.

Machine Learning Techniques in Bank Credit Analysis

The aim of this paper is to compare and discuss better classifier algorithm options for credit risk assessment by applying different Machine Learning techniques. Using records from a Brazilian financial institution, this study uses a database of 5,432 companies that are clients of the bank, where 2,600 clients are classified as non-defaulters, 1,551 are classified as defaulters and 1,281 are temporarily defaulters, meaning that the clients are overdue on their payments for up 180 days. For each case, a total of 15 attributes was considered for a one-against-all assessment using four different techniques: Artificial Neural Networks Multilayer Perceptron (ANN-MLP), Artificial Neural Networks Radial Basis Functions (ANN-RBF), Logistic Regression (LR) and finally Support Vector Machines (SVM). For each method, different parameters were analyzed in order to obtain different results when the best of each technique was compared. Initially the data were coded in thermometer code (numerical attributes) or dummy coding (for nominal attributes). The methods were then evaluated for each parameter and the best result of each technique was compared in terms of accuracy, false positives, false negatives, true positives and true negatives. This comparison showed that the best method, in terms of accuracy, was ANN-RBF (79.20% for non-defaulter classification, 97.74% for defaulters and 75.37% for the temporarily defaulter classification). However, the best accuracy does not always represent the best technique. For instance, on the classification of temporarily defaulters, this technique, in terms of false positives, was surpassed by SVM, which had the lowest rate (0.07%) of false positive classifications. All these intrinsic details are discussed considering the results found, and an overview of what was presented is shown in the conclusion of this study.

Vision-Based Daily Routine Recognition for Healthcare with Transfer Learning

We propose to record Activities of Daily Living (ADLs) of elderly people using a vision-based system so as to provide better assistive and personalization technologies. Current ADL-related research is based on data collected with help from non-elderly subjects in laboratory environments and the activities performed are predetermined for the sole purpose of data collection. To obtain more realistic datasets for the application, we recorded ADLs for the elderly with data collected from real-world environment involving real elderly subjects. Motivated by the need to collect data for more effective research related to elderly care, we chose to collect data in the room of an elderly person. Specifically, we installed Kinect, a vision-based sensor on the ceiling, to capture the activities that the elderly subject performs in the morning every day. Based on the data, we identified 12 morning activities that the elderly person performs daily. To recognize these activities, we created a HARELCARE framework to investigate into the effectiveness of existing Human Activity Recognition (HAR) algorithms and propose the use of a transfer learning algorithm for HAR. We compared the performance, in terms of accuracy, and training progress. Although the collected dataset is relatively small, the proposed algorithm has a good potential to be applied to all daily routine activities for healthcare purposes such as evidence-based diagnosis and treatment.

The Delaying Influence of Degradation on the Divestment of Gas Turbines for Associated Gas Utilisation: Part 1

An important feature of the exploitation of associated gas as fuel for gas turbine engines is a declining supply. So when exploiting this resource, the divestment of prime movers is very important as the fuel supply diminishes with time. This paper explores the influence of engine degradation on the timing of divestments. Hypothetical but realistic gas turbine engines were modelled with Turbomatch, the Cranfield University gas turbine performance simulation tool. The results were deployed in three degradation scenarios within the TERA (Techno-economic and environmental risk analysis) framework to develop economic models. An optimisation with Genetic Algorithms was carried out to maximize the economic benefit. The results show that degradation will have a significant impact. It will delay the divestment of power plants, while they are running less efficiently. Over a 20 year investment, a decrease of $0.11bn, $0.26bn and $0.45bn (billion US dollars) were observed for the three degradation scenarios as against the clean case.

Controlling of Multi-Level Inverter under Shading Conditions Using Artificial Neural Network

This paper describes the effects of photovoltaic voltage changes on Multi-level inverter (MLI) due to solar irradiation variations, and methods to overcome these changes. The irradiation variation affects the generated voltage, which in turn varies the switching angles required to turn-on the inverter power switches in order to obtain minimum harmonic content in the output voltage profile. Genetic Algorithm (GA) is used to solve harmonics elimination equations of eleven level inverters with equal and non-equal dc sources. After that artificial neural network (ANN) algorithm is proposed to generate appropriate set of switching angles for MLI at any level of input dc sources voltage causing minimization of the total harmonic distortion (THD) to an acceptable limit. MATLAB/Simulink platform is used as a simulation tool and Fast Fourier Transform (FFT) analyses are carried out for output voltage profile to verify the reliability and accuracy of the applied technique for controlling the MLI harmonic distortion. According to the simulation results, the obtained THD for equal dc source is 9.38%, while for variable or unequal dc sources it varies between 10.26% and 12.93% as the input dc voltage varies between 4.47V nd 11.43V respectively. The proposed ANN algorithm provides satisfied simulation results that match with results obtained by alternative algorithms.

Multi-Objective Optimal Design of a Cascade Control System for a Class of Underactuated Mechanical Systems

This paper presents a multi-objective optimal design of a cascade control system for an underactuated mechanical system. Cascade control structures usually include two control algorithms (inner and outer). To design such a control system properly, the following conflicting objectives should be considered at the same time: 1) the inner closed-loop control must be faster than the outer one, 2) the inner loop should fast reject any disturbance and prevent it from propagating to the outer loop, 3) the controlled system should be insensitive to measurement noise, and 4) the controlled system should be driven by optimal energy. Such a control problem can be formulated as a multi-objective optimization problem such that the optimal trade-offs among these design goals are found. To authors best knowledge, such a problem has not been studied in multi-objective settings so far. In this work, an underactuated mechanical system consisting of a rotary servo motor and a ball and beam is used for the computer simulations, the setup parameters of the inner and outer control systems are tuned by NSGA-II (Non-dominated Sorting Genetic Algorithm), and the dominancy concept is used to find the optimal design points. The solution of this problem is not a single optimal cascade control, but rather a set of optimal cascade controllers (called Pareto set) which represent the optimal trade-offs among the selected design criteria. The function evaluation of the Pareto set is called the Pareto front. The solution set is introduced to the decision-maker who can choose any point to implement. The simulation results in terms of Pareto front and time responses to external signals show the competing nature among the design objectives. The presented study may become the basis for multi-objective optimal design of multi-loop control systems.