Road Traffic Accidents Analysis in Mexico City through Crowdsourcing Data and Data Mining Techniques

Road traffic accidents are among the principal causes of traffic congestion, causing human losses, damages to health and the environment, economic losses and material damages. Studies about traditional road traffic accidents in urban zones represents very high inversion of time and money, additionally, the result are not current. However, nowadays in many countries, the crowdsourced GPS based traffic and navigation apps have emerged as an important source of information to low cost to studies of road traffic accidents and urban congestion caused by them. In this article we identified the zones, roads and specific time in the CDMX in which the largest number of road traffic accidents are concentrated during 2016. We built a database compiling information obtained from the social network known as Waze. The methodology employed was Discovery of knowledge in the database (KDD) for the discovery of patterns in the accidents reports. Furthermore, using data mining techniques with the help of Weka. The selected algorithms was the Maximization of Expectations (EM) to obtain the number ideal of clusters for the data and k-means as a grouping method. Finally, the results were visualized with the Geographic Information System QGIS.

A Web and Cloud-Based Measurement System Analysis Tool for the Automotive Industry

Any industrial company needs to determine the amount of variation that exists within its measurement process and guarantee the reliability of their data, studying the performance of their measurement system, in terms of linearity, bias, repeatability and reproducibility and stability. This issue is critical for automotive industry suppliers, who are required to be certified by the 16949:2016 standard (replaces the ISO/TS 16949) of International Automotive Task Force, defining the requirements of a quality management system for companies in the automotive industry. Measurement System Analysis (MSA) is one of the mandatory tools. Frequently, the measurement system in companies is not connected to the equipment and do not incorporate the methods proposed by the Automotive Industry Action Group (AIAG). To address these constraints, an R&D project is in progress, whose objective is to develop a web and cloud-based MSA tool. This MSA tool incorporates Industry 4.0 concepts, such as, Internet of Things (IoT) protocols to assure the connection with the measuring equipment, cloud computing, artificial intelligence, statistical tools, and advanced mathematical algorithms. This paper presents the preliminary findings of the project. The web and cloud-based MSA tool is innovative because it implements all statistical tests proposed in the MSA-4 reference manual from AIAG as well as other emerging methods and techniques. As it is integrated with the measuring devices, it reduces the manual input of data and therefore the errors. The tool ensures traceability of all performed tests and can be used in quality laboratories and in the production lines. Besides, it monitors MSAs over time, allowing both the analysis of deviations from the variation of the measurements performed and the management of measurement equipment and calibrations. To develop the MSA tool a ten-step approach was implemented. Firstly, it was performed a benchmarking analysis of the current competitors and commercial solutions linked to MSA, concerning Industry 4.0 paradigm. Next, an analysis of the size of the target market for the MSA tool was done. Afterwards, data flow and traceability requirements were analysed in order to implement an IoT data network that interconnects with the equipment, preferably via wireless. The MSA web solution was designed under UI/UX principles and an API in python language was developed to perform the algorithms and the statistical analysis. Continuous validation of the tool by companies is being performed to assure real time management of the ‘big data’. The main results of this R&D project are: MSA Tool, web and cloud-based; Python API; New Algorithms to the market; and Style Guide of UI/UX of the tool. The MSA tool proposed adds value to the state of the art as it ensures an effective response to the new challenges of measurement systems, which are increasingly critical in production processes. Although the automotive industry has triggered the development of this innovative MSA tool, other industries would also benefit from it. Currently, companies from molds and plastics, chemical and food industry are already validating it.

An Observer-Based Direct Adaptive Fuzzy Sliding Control with Adjustable Membership Functions

In this paper, an observer-based direct adaptive fuzzy sliding mode (OAFSM) algorithm is proposed. In the proposed algorithm, the zero-input dynamics of the plant could be unknown. The input connection matrix is used to combine the sliding surfaces of individual subsystems, and an adaptive fuzzy algorithm is used to estimate an equivalent sliding mode control input directly. The fuzzy membership functions, which were determined by time consuming try and error processes in previous works, are adjusted by adaptive algorithms. The other advantage of the proposed controller is that the input gain matrix is not limited to be diagonal, i.e. the plant could be over/under actuated provided that controllability and observability are preserved. An observer is constructed to directly estimate the state tracking error, and the nonlinear part of the observer is constructed by an adaptive fuzzy algorithm. The main advantage of the proposed observer is that, the measured outputs is not limited to the first entry of a canonical-form state vector. The closed-loop stability of the proposed method is proved using a Lyapunov-based approach. The proposed method is applied numerically on a multi-link robot manipulator, which verifies the performance of the closed-loop control. Moreover, the performance of the proposed algorithm is compared with some conventional control algorithms.

Distances over Incomplete Diabetes and Breast Cancer Data Based on Bhattacharyya Distance

Missing values in real-world datasets are a common problem. Many algorithms were developed to deal with this problem, most of them replace the missing values with a fixed value that was computed based on the observed values. In our work, we used a distance function based on Bhattacharyya distance to measure the distance between objects with missing values. Bhattacharyya distance, which measures the similarity of two probability distributions. The proposed distance distinguishes between known and unknown values. Where the distance between two known values is the Mahalanobis distance. When, on the other hand, one of them is missing the distance is computed based on the distribution of the known values, for the coordinate that contains the missing value. This method was integrated with Wikaya, a digital health company developing a platform that helps to improve prevention of chronic diseases such as diabetes and cancer. In order for Wikaya’s recommendation system to work distance between users need to be measured. Since there are missing values in the collected data, there is a need to develop a distance function distances between incomplete users profiles. To evaluate the accuracy of the proposed distance function in reflecting the actual similarity between different objects, when some of them contain missing values, we integrated it within the framework of k nearest neighbors (kNN) classifier, since its computation is based only on the similarity between objects. To validate this, we ran the algorithm over diabetes and breast cancer datasets, standard benchmark datasets from the UCI repository. Our experiments show that kNN classifier using our proposed distance function outperforms the kNN using other existing methods.

Automated Heart Sound Classification from Unsegmented Phonocardiogram Signals Using Time Frequency Features

Cardiologists perform cardiac auscultation to detect abnormalities in heart sounds. Since accurate auscultation is a crucial first step in screening patients with heart diseases, there is a need to develop computer-aided detection/diagnosis (CAD) systems to assist cardiologists in interpreting heart sounds and provide second opinions. In this paper different algorithms are implemented for automated heart sound classification using unsegmented phonocardiogram (PCG) signals. Support vector machine (SVM), artificial neural network (ANN) and cartesian genetic programming evolved artificial neural network (CGPANN) without the application of any segmentation algorithm has been explored in this study. The signals are first pre-processed to remove any unwanted frequencies. Both time and frequency domain features are then extracted for training the different models. The different algorithms are tested in multiple scenarios and their strengths and weaknesses are discussed. Results indicate that SVM outperforms the rest with an accuracy of 73.64%.

Meta-Learning for Hierarchical Classification and Applications in Bioinformatics

Hierarchical classification is a special type of classification task where the class labels are organised into a hierarchy, with more generic class labels being ancestors of more specific ones. Meta-learning for classification-algorithm recommendation consists of recommending to the user a classification algorithm, from a pool of candidate algorithms, for a dataset, based on the past performance of the candidate algorithms in other datasets. Meta-learning is normally used in conventional, non-hierarchical classification. By contrast, this paper proposes a meta-learning approach for more challenging task of hierarchical classification, and evaluates it in a large number of bioinformatics datasets. Hierarchical classification is especially relevant for bioinformatics problems, as protein and gene functions tend to be organised into a hierarchy of class labels. This work proposes meta-learning approach for recommending the best hierarchical classification algorithm to a hierarchical classification dataset. This work’s contributions are: 1) proposing an algorithm for splitting hierarchical datasets into new datasets to increase the number of meta-instances, 2) proposing meta-features for hierarchical classification, and 3) interpreting decision-tree meta-models for hierarchical classification algorithm recommendation.

Trust Management for an Authentication System in Ubiquitous Computing

Security of context-aware ubiquitous systems is paramount, and authentication plays an important aspect in cloud computing and ubiquitous computing. Trust management has been identified as vital component for establishing and maintaining successful relational exchanges between trading partners in cloud and ubiquitous systems. Establishing trust is the way to build good relationship with both client and provider which positive activates will increase trust level, otherwise destroy trust immediately. We propose a new context-aware authentication system using a trust management system between client and server, and between servers, a trust which induces partnership, thus to a close cooperation between these servers. We defined the rules (algorithms), as well as the formulas to manage and calculate the trusting degrees depending on context, in order to uniquely authenticate a user, thus a single sign-on, and to provide him better services.

Deep Learning for Renewable Power Forecasting: An Approach Using LSTM Neural Networks

Load forecasting has become crucial in recent years and become popular in forecasting area. Many different power forecasting models have been tried out for this purpose. Electricity load forecasting is necessary for energy policies, healthy and reliable grid systems. Effective power forecasting of renewable energy load leads the decision makers to minimize the costs of electric utilities and power plants. Forecasting tools are required that can be used to predict how much renewable energy can be utilized. The purpose of this study is to explore the effectiveness of LSTM-based neural networks for estimating renewable energy loads. In this study, we present models for predicting renewable energy loads based on deep neural networks, especially the Long Term Memory (LSTM) algorithms. Deep learning allows multiple layers of models to learn representation of data. LSTM algorithms are able to store information for long periods of time. Deep learning models have recently been used to forecast the renewable energy sources such as predicting wind and solar energy power. Historical load and weather information represent the most important variables for the inputs within the power forecasting models. The dataset contained power consumption measurements are gathered between January 2016 and December 2017 with one-hour resolution. Models use publicly available data from the Turkish Renewable Energy Resources Support Mechanism. Forecasting studies have been carried out with these data via deep neural networks approach including LSTM technique for Turkish electricity markets. 432 different models are created by changing layers cell count and dropout. The adaptive moment estimation (ADAM) algorithm is used for training as a gradient-based optimizer instead of SGD (stochastic gradient). ADAM performed better than SGD in terms of faster convergence and lower error rates. Models performance is compared according to MAE (Mean Absolute Error) and MSE (Mean Squared Error). Best five MAE results out of 432 tested models are 0.66, 0.74, 0.85 and 1.09. The forecasting performance of the proposed LSTM models gives successful results compared to literature searches.

Unsteady 3D Post-Stall Aerodynamics Accounting for Effective Loss in Camber Due to Flow Separation

The current study couples a quasi-steady Vortex Lattice Method and a camber correcting technique, ‘Decambering’ for unsteady post-stall flow prediction. The wake is force-free and discrete such that the wake lattices move with the free-stream once shed from the wing. It is observed that the time-averaged unsteady coefficient of lift sees a relative drop at post-stall angles of attack in comparison to its steady counterpart for some angles of attack. Multiple solutions occur at post-stall and three different algorithms to choose solutions in these regimes show both unsteadiness and non-convergence of the iterations. The distribution of coefficient of lift on the wing span also shows sawtooth. Distribution of vorticity changes both along span and in the direction of the free-stream as the wake develops over time with distinct roll-up, which increases with time.

Modeling and Dynamics Analysis for Intelligent Skid-Steering Vehicle Based on Trucksim-Simulink

Aiming at the verification of control algorithms for skid-steering vehicles, a vehicle simulation model of 6×6 electric skid-steering unmanned vehicle was established based on Trucksim and Simulink. The original transmission and steering mechanism of Trucksim are removed, and the electric skid-steering model and a closed-loop controller for the vehicle speed and yaw rate are built in Simulink. The simulation results are compared with the ones got by theoretical formulas. The results show that the predicted tire mechanics and vehicle kinematics of Trucksim-Simulink simulation model are closed to the theoretical results. Therefore, it can be used as an effective approach to study the dynamic performance and control algorithm of skid-steering vehicle. In this paper, a method of motion control based on feed forward control is also designed. The simulation results show that the feed forward control strategy can make the vehicle follow the target yaw rate more quickly and accurately, which makes the vehicle have more maneuverability.

A Review on Comparative Analysis of Path Planning and Collision Avoidance Algorithms

Autonomous mobile robots (AMR) are expected as smart tools for operations in every automation industry. Path planning and obstacle avoidance is the backbone of AMR as robots have to reach their goal location avoiding obstacles while traversing through optimized path defined according to some criteria such as distance, time or energy. Path planning can be classified into global and local path planning where environmental information is known and unknown/partially known, respectively. A number of sensors are used for data collection. A number of algorithms such as artificial potential field (APF), rapidly exploring random trees (RRT), bidirectional RRT, Fuzzy approach, Purepursuit, A* algorithm, vector field histogram (VFH) and modified local path planning algorithm, etc. have been used in the last three decades for path planning and obstacle avoidance for AMR. This paper makes an attempt to review some of the path planning and obstacle avoidance algorithms used in the field of AMR. The review includes comparative analysis of simulation and mathematical computations of path planning and obstacle avoidance algorithms using MATLAB 2018a. From the review, it could be concluded that different algorithms may complete the same task (i.e. with a different set of instructions) in less or more time, space, effort, etc.

Hand Gesture Detection via EmguCV Canny Pruning

Hand gesture recognition is a technique used to locate, detect, and recognize a hand gesture. Detection and recognition are concepts of Artificial Intelligence (AI). AI concepts are applicable in Human Computer Interaction (HCI), Expert systems (ES), etc. Hand gesture recognition can be used in sign language interpretation. Sign language is a visual communication tool. This tool is used mostly by deaf societies and those with speech disorder. Communication barriers exist when societies with speech disorder interact with others. This research aims to build a hand recognition system for Lesotho’s Sesotho and English language interpretation. The system will help to bridge the communication problems encountered by the mentioned societies. The system has various processing modules. The modules consist of a hand detection engine, image processing engine, feature extraction, and sign recognition. Detection is a process of identifying an object. The proposed system uses Canny pruning Haar and Haarcascade detection algorithms. Canny pruning implements the Canny edge detection. This is an optimal image processing algorithm. It is used to detect edges of an object. The system employs a skin detection algorithm. The skin detection performs background subtraction, computes the convex hull, and the centroid to assist in the detection process. Recognition is a process of gesture classification. Template matching classifies each hand gesture in real-time. The system was tested using various experiments. The results obtained show that time, distance, and light are factors that affect the rate of detection and ultimately recognition. Detection rate is directly proportional to the distance of the hand from the camera. Different lighting conditions were considered. The more the light intensity, the faster the detection rate. Based on the results obtained from this research, the applied methodologies are efficient and provide a plausible solution towards a light-weight, inexpensive system which can be used for sign language interpretation.

A Speeded up Robust Scale-Invariant Feature Transform Currency Recognition Algorithm

All currencies around the world look very different from each other. For instance, the size, color, and pattern of the paper are different. With the development of modern banking services, automatic methods for paper currency recognition become important in many applications like vending machines. One of the currency recognition architecture’s phases is Feature detection and description. There are many algorithms that are used for this phase, but they still have some disadvantages. This paper proposes a feature detection algorithm, which merges the advantages given in the current SIFT and SURF algorithms, which we call, Speeded up Robust Scale-Invariant Feature Transform (SR-SIFT) algorithm. Our proposed SR-SIFT algorithm overcomes the problems of both the SIFT and SURF algorithms. The proposed algorithm aims to speed up the SIFT feature detection algorithm and keep it robust. Simulation results demonstrate that the proposed SR-SIFT algorithm decreases the average response time, especially in small and minimum number of best key points, increases the distribution of the number of best key points on the surface of the currency. Furthermore, the proposed algorithm increases the accuracy of the true best point distribution inside the currency edge than the other two algorithms.

A Mixed Integer Linear Programming Model for Flexible Job Shop Scheduling Problem

In this paper, a mixed integer linear programming (MILP) model is presented to solve the flexible job shop scheduling problem (FJSP). This problem is one of the hardest combinatorial problems. The objective considered is the minimization of the makespan. The computational results of the proposed MILP model were compared with those of the best known mathematical model in the literature in terms of the computational time. The results show that our model has better performance with respect to all the considered performance measures including relative percentage deviation (RPD) value, number of constraints, and total number of variables. By this improved mathematical model, larger FJS problems can be optimally solved in reasonable time, and therefore, the model would be a better tool for the performance evaluation of the approximation algorithms developed for the problem.

CBIR Using Multi-Resolution Transform for Brain Tumour Detection and Stages Identification

Image retrieval is the most interesting technique which is being used today in our digital world. CBIR, commonly expanded as Content Based Image Retrieval is an image processing technique which identifies the relevant images and retrieves them based on the patterns that are extracted from the digital images. In this paper, two research works have been presented using CBIR. The first work provides an automated and interactive approach to the analysis of CBIR techniques. CBIR works on the principle of supervised machine learning which involves feature selection followed by training and testing phase applied on a classifier in order to perform prediction. By using feature extraction, the image transforms such as Contourlet, Ridgelet and Shearlet could be utilized to retrieve the texture features from the images. The features extracted are used to train and build a classifier using the classification algorithms such as Naïve Bayes, K-Nearest Neighbour and Multi-class Support Vector Machine. Further the testing phase involves prediction which predicts the new input image using the trained classifier and label them from one of the four classes namely 1- Normal brain, 2- Benign tumour, 3- Malignant tumour and 4- Severe tumour. The second research work includes developing a tool which is used for tumour stage identification using the best feature extraction and classifier identified from the first work. Finally, the tool will be used to predict tumour stage and provide suggestions based on the stage of tumour identified by the system. This paper presents these two approaches which is a contribution to the medical field for giving better retrieval performance and for tumour stages identification.

Particle Swarm Optimization and Quantum Particle Swarm Optimization to Multidimensional Function Approximation

This work compares the results of multidimensional function approximation using two algorithms: the classical Particle Swarm Optimization (PSO) and the Quantum Particle Swarm Optimization (QPSO). These algorithms were both tested on three functions - The Rosenbrock, the Rastrigin, and the sphere functions - with different characteristics by increasing their number of dimensions. As a result, this study shows that the higher the function space, i.e. the larger the function dimension, the more evident the advantages of using the QPSO method compared to the PSO method in terms of performance and number of necessary iterations to reach the stop criterion.

Crude Oil Price Prediction Using LSTM Networks

Crude oil market is an immensely complex and dynamic environment and thus the task of predicting changes in such an environment becomes challenging with regards to its accuracy. A number of approaches have been adopted to take on that challenge and machine learning has been at the core in many of them. There are plenty of examples of algorithms based on machine learning yielding satisfactory results for such type of prediction. In this paper, we have tried to predict crude oil prices using Long Short-Term Memory (LSTM) based recurrent neural networks. We have tried to experiment with different types of models using different epochs, lookbacks and other tuning methods. The results obtained are promising and presented a reasonably accurate prediction for the price of crude oil in near future.

A Construction Management Tool: Determining a Project Schedule Typical Behaviors Using Cluster Analysis

Delays in the construction industry are a global phenomenon. Many construction projects experience extensive delays exceeding the initially estimated completion time. The main purpose of this study is to identify construction projects typical behaviors in order to develop a prognosis and management tool. Being able to know a construction projects schedule tendency will enable evidence-based decision-making to allow resolutions to be made before delays occur. This study presents an innovative approach that uses Cluster Analysis Method to support predictions during Earned Value Analyses. A clustering analysis was used to predict future scheduling, Earned Value Management (EVM), and Earned Schedule (ES) principal Indexes behaviors in construction projects. The analysis was made using a database with 90 different construction projects. It was validated with additional data extracted from literature and with another 15 contrasting projects. For all projects, planned and executed schedules were collected and the EVM and ES principal indexes were calculated. A complete linkage classification method was used. In this way, the cluster analysis made considers that the distance (or similarity) between two clusters must be measured by its most disparate elements, i.e. that the distance is given by the maximum span among its components. Finally, through the use of EVM and ES Indexes and Tukey and Fisher Pairwise Comparisons, the statistical dissimilarity was verified and four clusters were obtained. It can be said that construction projects show an average delay of 35% of its planned completion time. Furthermore, four typical behaviors were found and for each of the obtained clusters, the interim milestones and the necessary rhythms of construction were identified. In general, detected typical behaviors are: (1) Projects that perform a 5% of work advance in the first two tenths and maintain a constant rhythm until completion (greater than 10% for each remaining tenth), being able to finish on the initially estimated time. (2) Projects that start with an adequate construction rate but suffer minor delays culminating with a total delay of almost 27% of the planned time. (3) Projects which start with a performance below the planned rate and end up with an average delay of 64%, and (4) projects that begin with a poor performance, suffer great delays and end up with an average delay of a 120% of the planned completion time. The obtained clusters compose a tool to identify the behavior of new construction projects by comparing their current work performance to the validated database, thus allowing the correction of initial estimations towards more accurate completion schedules.

A Numerical Description of a Fibre Reinforced Concrete Using a Genetic Algorithm

This work reports about an approach for an automatic adaptation of concrete formulations based on genetic algorithms (GA) to optimize a wide range of different fit-functions. In order to achieve the goal, a method was developed which provides a numerical description of a fibre reinforced concrete (FRC) mixture regarding the production technology and the property spectrum of the concrete. In a first step, the FRC mixture with seven fixed components was characterized by varying amounts of the components. For that purpose, ten concrete mixtures were prepared and tested. The testing procedure comprised flow spread, compressive and bending tensile strength. The analysis and approximation of the determined data was carried out by GAs. The aim was to obtain a closed mathematical expression which best describes the given seven-point cloud of FRC by applying a Gene Expression Programming with Free Coefficients (GEP-FC) strategy. The seven-parametric FRC-mixtures model which is generated according to this method correlated well with the measured data. The developed procedure can be used for concrete mixtures finding closed mathematical expressions, which are based on the measured data.

Pre- and Post-Analyses of Disruptive Quay Crane Scheduling Problem

In the past, the quay crane operations have been well studied. There were a certain number of scheduling algorithms for quay crane operations, but without considering some nuisance factors that might disrupt the quay crane operations. For example, bad grapples make a crane unable to load or unload containers or a sudden strong breeze stops operations temporarily. Although these disruptive conditions randomly occur, they influence the efficiency of quay crane operations. The disruption is not considered in the operational procedures nor is evaluated in advance for its impacts. This study applies simulation and optimization approaches to develop structures of pre-analysis and post-analysis for the Quay Crane Scheduling Problem to deal with disruptive scenarios for quay crane operation. Numerical experiments are used for demonstrations for the validity of the developed approaches.