Predictions of Dynamic Behaviors for Gas Foil Bearings Operating at Steady-State Based on Multi-Physics Coupling Computer Aided Engineering Simulations

A simulation scheme of rotational motions for predictions of bump-type gas foil bearings operating at steady-state is proposed. The scheme is based on multi-physics coupling computer aided engineering packages modularized with computational fluid dynamic model and structure elasticity model to numerically solve the dynamic equation of motions of a hydrodynamic loaded shaft supported by an elastic bump foil. The bump foil is assumed to be modelled as infinite number of Hookean springs mounted on stiff wall. Hence, the top foil stiffness is constant on the periphery of the bearing housing. The hydrodynamic pressure generated by the air film lubrication transfers to the top foil and induces elastic deformation needed to be solved by a finite element method program, whereas the pressure profile applied on the top foil must be solved by a finite element method program based on Reynolds Equation in lubrication theory. As a result, the equation of motions for the bearing shaft are iteratively solved via coupling of the two finite element method programs simultaneously. In conclusion, the two-dimensional center trajectory of the shaft plus the deformation map on top foil at constant rotational speed are calculated for comparisons with the experimental results.

Forecasting 24-Hour Ahead Electricity Load Using Time Series Models

Forecasting electricity load is important for various purposes like planning, operation and control. Forecasts can save operating and maintenance costs, increase the reliability of power supply and delivery systems, and correct decisions for future development. This paper compares various time series methods to forecast 24 hours ahead of electricity load. The methods considered are the Holt-Winters smoothing, SARIMA Modeling, LSTM Network, Fbprophet and Tensorflow probability. The performance of each method is evaluated by using the forecasting accuracy criteria namely, the Mean Absolute Error and Root Mean Square Error. The National Renewable Energy Laboratory (NREL) residential energy consumption data are used to train the models. The results of this study show that SARIMA model is superior to the others for 24 hours ahead forecasts. Furthermore, a Bagging technique is used to make the predictions more robust. The obtained results show that by Bagging multiple time-series forecasts we can improve the robustness of the models for 24 hour ahead electricity load forecasting.

Time Series Forecasting Using Various Deep Learning Models

Time Series Forecasting (TSF) is used to predict the target variables at a future time point based on the learning from previous time points. To keep the problem tractable, learning methods use data from a fixed length window in the past as an explicit input. In this paper, we study how the performance of predictive models change as a function of different look-back window sizes and different amounts of time to predict into the future. We also consider the performance of the recent attention-based transformer models, which had good success in the image processing and natural language processing domains. In all, we compare four different deep learning methods (Recurrent Neural Network (RNN), Long Short-term Memory (LSTM), Gated Recurrent Units (GRU), and Transformer) along with a baseline method. The dataset (hourly) we used is the Beijing Air Quality Dataset from the website of University of California, Irvine (UCI), which includes a multivariate time series of many factors measured on an hourly basis for a period of 5 years (2010-14). For each model, we also report on the relationship between the performance and the look-back window sizes and the number of predicted time points into the future. Our experiments suggest that Transformer models have the best performance with the lowest Mean   Absolute Errors (MAE = 14.599, 23.273) and Root Mean Square Errors (RSME = 23.573, 38.131) for most of our single-step and multi-steps predictions. The best size for the look-back window to predict 1 hour into the future appears to be one day, while 2 or 4 days perform the best to predict 3 hours into the future.

A Comparison of Air Pollution in Developed and Developing Cities: A Case Study of London and Beijing

With the rapid development of industrialization, countries in different stages of development in the world have gradually begun to pay attention to the impact of air pollution on health and the environment. Air control in developed countries is an effective reference for air control in developing countries. Artificial intelligence and other technologies also play a positive role in the prediction of air pollution. By comparing the annual changes of pollution in London and Beijing, this paper concludes that the pollution in developed cities is relatively low and stable, while the pollution in Beijing is relatively heavy and unstable, but is clearly improving. In addition, by analyzing the changes of major pollutants in Beijing in the past eight years, it is concluded that all pollutants except O3 show a significant downward trend. In addition, all pollutants except O3 have certain correlation. For example, PM10 and PM2.5 have the greatest influence on air quality index (AQI). Python, which is commonly used by artificial intelligence, is used as the main software to establish two models, support vector machine (SVM) and linear regression. By comparing the two models under the same conditions, it is concluded that SVM has higher accuracy in pollution prediction. The results of this study provide valuable reference for pollution control and prediction in developing countries.

Using Statistical Significance and Prediction to Test Long/Short Term Public Services and Patients Cohorts: A Case Study in Scotland

Health and Social care (HSc) services planning and scheduling are facing unprecedented challenges, due to the pandemic pressure and also suffer from unplanned spending that is negatively impacted by the global financial crisis. Data-driven approaches can help to improve policies, plan and design services provision schedules using algorithms that assist healthcare managers to face unexpected demands using fewer resources. The paper discusses services packing using statistical significance tests and machine learning (ML) to evaluate demands similarity and coupling. This is achieved by predicting the range of the demand (class) using ML methods such as Classification and Regression Trees (CART), Random Forests (RF), and Logistic Regression (LGR). The significance tests Chi-Squared and Student’s test are used on data over a 39 years span for which data exist for services delivered in Scotland. The demands are associated using probabilities and are parts of statistical hypotheses. These hypotheses, as their NULL part, assume that the target demand is statistically dependent on other services’ demands. This linking is checked using the data. In addition, ML methods are used to linearly predict the above target demands from the statistically found associations and extend the linear dependence of the target’s demand to independent demands forming, thus, groups of services. Statistical tests confirmed ML coupling and made the prediction statistically meaningful and proved that a target service can be matched reliably to other services while ML showed that such marked relationships can also be linear ones. Zero padding was used for missing years records and illustrated better such relationships both for limited years and for the entire span offering long-term data visualizations while limited years periods explained how well patients numbers can be related in short periods of time or that they can change over time as opposed to behaviours across more years. The prediction performance of the associations were measured using metrics such as Receiver Operating Characteristic (ROC), Area Under Curve (AUC) and Accuracy (ACC) as well as the statistical tests Chi-Squared and Student. Co-plots and comparison tables for the RF, CART, and LGR methods as well as the p-value from tests and Information Exchange (IE/MIE) measures are provided showing the relative performance of ML methods and of the statistical tests as well as the behaviour using different learning ratios. The impact of k-neighbours classification (k-NN), Cross-Correlation (CC) and C-Means (CM) first groupings was also studied over limited years and for the entire span. It was found that CART was generally behind RF and LGR but in some interesting cases, LGR reached an AUC = 0 falling below CART, while the ACC was as high as 0.912 showing that ML methods can be confused by zero-padding or by data’s irregularities or by the outliers. On average, 3 linear predictors were sufficient, LGR was found competing well RF and CART followed with the same performance at higher learning ratios. Services were packed only when a significance level (p-value) of their association coefficient was more than 0.05. Social factors relationships were observed between home care services and treatment of old people, low birth weights, alcoholism, drug abuse, and emergency admissions. The work found  that different HSc services can be well packed as plans of limited duration, across various services sectors, learning configurations, as confirmed by using statistical hypotheses.

A Large Dataset Imputation Approach Applied to Country Conflict Prediction Data

This study demonstrates an alternative stochastic imputation approach for large datasets when preferred commercial packages struggle to iterate due to numerical problems. A large country conflict dataset motivates the search to impute missing values well over a common threshold of 20% missingness. The methodology capitalizes on correlation while using model residuals to provide the uncertainty in estimating unknown values. Examination of the methodology provides insight toward choosing linear or nonlinear modeling terms. Static tolerances common in most packages are replaced with tailorable tolerances that exploit residuals to fit each data element. The methodology evaluation includes observing computation time, model fit, and the comparison of known  values to replaced values created through imputation. Overall, the country conflict dataset illustrates promise with modeling first-order interactions, while presenting a need for further refinement that mimics predictive mean matching.

Stock Movement Prediction Using Price Factor and Deep Learning

The development of machine learning methods and techniques has opened doors for investigation in many areas such as medicines, economics, finance, etc. One active research area involving machine learning is stock market prediction. This research paper tries to consider multiple techniques and methods for stock movement prediction using historical price or price factors. The paper explores the effectiveness of some deep learning frameworks for forecasting stock. Moreover, an architecture (TimeStock) is proposed which takes the representation of time into account apart from the price information itself. Our model achieves a promising result that shows a potential approach for the stock movement prediction problem.

A Procedure to Assess Streamflow Rating Curves and Streamflow Sequences

This study aims to provide sub-hourly streamflow predictions and associated rating curves for small catchments of intermittent and torrential flow regime characterized by flash floods occurring especially during April and November. The methodology entails two lumped conceptual hydrological models which work in series. The total model is based upon eleven parameters and shows good flexibility in handling different input sets. Runoff Coefficient has contributed to improving the model’s performances and has been treated as an additional parameter; while Sensitivity Analysis has highlighted how slight changes in the model’s input can lead to changes in model’s output. The adopted procedure is steady and useful to give very practical engineering information at the expense of a parsimonious request both in input data and in the number of adopted parameters. According to the obtained results, the authors encourage the test of this combined procedure on different hydrological scenarios in order to provide information for poorly monitored catchments and not updated sites.

Injury Prediction for Soccer Players Using Machine Learning

Injuries in professional sports occur on a regular basis. Some may be minor while others can cause huge impact on a player’s career and earning potential. In soccer, there is a high risk of players picking up injuries during game time. This research work seeks to help soccer players reduce the risk of getting injured by predicting the likelihood of injury while playing in the near future and then providing recommendations for intervention. The injury prediction tool will use a soccer player’s number of minutes played on the field, number of appearances, distance covered and performance data for the current and previous seasons as variables to conduct statistical analysis and provide injury predictive results using a machine learning linear regression model.

Early Age Behavior of Wind Turbine Gravity Foundations

Wind turbine gravity foundations are designed to resist overturning failure through gravitational forces resulting from their masses. Owing to the relatively high volume of the cementitious material present, the foundations tend to suffer thermal strains and internal cracking due to high temperatures and temperature gradients depending on factors such as geometry, mix design and level of restraint. This is a result of a fully coupled mechanism commonly known as THMC (Thermo- Hygro - Mechanical - Chemical) coupling whose kinetics peak during the early age of concrete. The focus of this paper is therefore to present and offer a discussion on the temperature and humidity evolutions occurring in mass pours such as wind turbine gravity foundations based on sensor results obtained from the monitoring of an actual wind turbine foundation. To offer prediction of the evolutions, the formulation of a 3D Thermal-Hydro-Chemical (THC) model that is mainly derived from classical fundamental physical laws is also presented and discussed. The THC model can be mathematically fully coupled in Finite Element analyses. In the current study, COMSOL Multi-physics software was used to simulate the 3D THC coupling that occurred in the monitored wind turbine foundation to predict the temperature evolution at five different points within the foundation from time of casting.

A Risk Assessment Tool for the Contamination of Aflatoxins on Dried Figs based on Machine Learning Algorithms

Aflatoxins are highly poisonous and carcinogenic compounds produced by species of the genus Aspergillus spp. that can infect a variety of agricultural foods, including dried figs. Biological and environmental factors, such as population, pathogenicity and aflatoxinogenic capacity of the strains, topography, soil and climate parameters of the fig orchards are believed to have a strong effect on aflatoxin levels. Existing methods for aflatoxin detection and measurement, such as high-performance liquid chromatography (HPLC), and enzyme-linked immunosorbent assay (ELISA), can provide accurate results, but the procedures are usually time-consuming, sample-destructive and expensive. Predicting aflatoxin levels prior to crop harvest is useful for minimizing the health and financial impact of a contaminated crop. Consequently, there is interest in developing a tool that predicts aflatoxin levels based on topography and soil analysis data of fig orchards. This paper describes the development of a risk assessment tool for the contamination of aflatoxin on dried figs, based on the location and altitude of the fig orchards, the population of the fungus Aspergillus spp. in the soil, and soil parameters such as pH, saturation percentage (SP), electrical conductivity (EC), organic matter, particle size analysis (sand, silt, clay), concentration of the exchangeable cations (Ca, Mg, K, Na), extractable P and trace of elements (B, Fe, Mn, Zn and Cu), by employing machine learning methods. In particular, our proposed method integrates three machine learning techniques i.e., dimensionality reduction on the original dataset (Principal Component Analysis), metric learning (Mahalanobis Metric for Clustering) and K-nearest Neighbors learning algorithm (KNN), into an enhanced model, with mean performance equal to 85% by terms of the Pearson Correlation Coefficient (PCC) between observed and predicted values.

A Comprehensive Survey on Machine Learning Techniques and User Authentication Approaches for Credit Card Fraud Detection

With the increase of credit card usage, the volume of credit card misuse also has significantly increased, which may cause appreciable financial losses for both credit card holders and financial organizations issuing credit cards. As a result, financial organizations are working hard on developing and deploying credit card fraud detection methods, in order to adapt to ever-evolving, increasingly sophisticated defrauding strategies and identifying illicit transactions as quickly as possible to protect themselves and their customers. Compounding on the complex nature of such adverse strategies, credit card fraudulent activities are rare events compared to the number of legitimate transactions. Hence, the challenge to develop fraud detection that are accurate and efficient is substantially intensified and, as a consequence, credit card fraud detection has lately become a very active area of research. In this work, we provide a survey of current techniques most relevant to the problem of credit card fraud detection. We carry out our survey in two main parts. In the first part, we focus on studies utilizing classical machine learning models, which mostly employ traditional transnational features to make fraud predictions. These models typically rely on some static physical characteristics, such as what the user knows (knowledge-based method), or what he/she has access to (object-based method). In the second part of our survey, we review more advanced techniques of user authentication, which use behavioral biometrics to identify an individual based on his/her unique behavior while he/she is interacting with his/her electronic devices. These approaches rely on how people behave (instead of what they do), which cannot be easily forged. By providing an overview of current approaches and the results reported in the literature, this survey aims to drive the future research agenda for the community in order to develop more accurate, reliable and scalable models of credit card fraud detection.

Cirrhosis Mortality Prediction as Classification Using Frequent Subgraph Mining

In this work, we use machine learning and data analysis techniques to predict the one-year mortality of cirrhotic patients. Data from 2,322 patients with liver cirrhosis are collected at a single medical center. Different machine learning models are applied to predict one-year mortality. A comprehensive feature space including demographic information, comorbidity, clinical procedure and laboratory tests is being analyzed. A temporal pattern mining technic called Frequent Subgraph Mining (FSM) is being used. Model for End-stage liver disease (MELD) prediction of mortality is used as a comparator. All of our models statistically significantly outperform the MELD-score model and show an average 10% improvement of the area under the curve (AUC). The FSM technic itself does not improve the model significantly, but FSM, together with a machine learning technique called an ensemble, further improves the model performance. With the abundance of data available in healthcare through electronic health records (EHR), existing predictive models can be refined to identify and treat patients at risk for higher mortality. However, due to the sparsity of the temporal information needed by FSM, the FSM model does not yield significant improvements. Our work applies modern machine learning algorithms and data analysis methods on predicting one-year mortality of cirrhotic patients and builds a model that predicts one-year mortality significantly more accurate than the MELD score. We have also tested the potential of FSM and provided a new perspective of the importance of clinical features.

Additive Friction Stir Manufacturing Process: Interest in Understanding Thermal Phenomena and Numerical Modeling of the Temperature Rise Phase

Additive Friction Stir Manufacturing, or AFSM, is a new industrial process that follows the emergence of friction-based processes. The AFSM process is a solid-state additive process using the energy produced by the friction at the interface between a rotating non-consumable tool and a substrate. Friction depends on various parameters like axial force, rotation speed or friction coefficient. The feeder material is a metallic rod that flows through a hole in the tool. There is still a lack in understanding of the physical phenomena taking place during the process. This research aims at a better AFSM process understanding and implementation, thanks to numerical simulation and experimental validation performed on a prototype effector. Such an approach is considered a promising way for studying the influence of the process parameters and to finally identify a process window that seems relevant. The deposition of material through the AFSM process takes place in several phases. In chronological order these phases are the docking phase, the dwell time phase, the deposition phase, and the removal phase. The present work focuses on the dwell time phase that enables the temperature rise of the system due to pure friction. An analytic modeling of heat generation based on friction considers as main parameters the rotational speed and the contact pressure. Another parameter considered influential is the friction coefficient assumed to be variable, due to the self-lubrication of the system with the rise in temperature or the materials in contact roughness smoothing over time. This study proposes through a numerical modeling followed by an experimental validation to question the influence of the various input parameters on the dwell time phase. Rotation speed, temperature, spindle torque and axial force are the main monitored parameters during experimentations and serve as reference data for the calibration of the numerical model. This research shows that the geometry of the tool as well as fluctuations of the input parameters like axial force and rotational speed are very influential on the temperature reached and/or the time required to reach the targeted temperature. The main outcome is the prediction of a process window which is a key result for a more efficient process implementation.

Geometric Simplification Method of Building Energy Model Based on Building Performance Simulation

In the design stage of a new building, the energy model of this building is often required for the analysis of the performance on energy efficiency. In practice, a certain degree of geometric simplification should be done in the establishment of building energy models, since the detailed geometric features of a real building are hard to be described perfectly in most energy simulation engine, such as ESP-r, eQuest or EnergyPlus. Actually, the detailed description is not necessary when the result with extremely high accuracy is not demanded. Therefore, this paper analyzed the relationship between the error of the simulation result from building energy models and the geometric simplification of the models. Finally, the following two parameters are selected as the indices to characterize the geometric feature of in building energy simulation: the southward projected area and total side surface area of the building. Based on the parameterization method, the simplification from an arbitrary column building to a typical shape (a cuboid) building can be made for energy modeling. The result in this study indicates that no more than 7% prediction error of annual cooling/heating load will be caused by the geometric simplification for those buildings with the ratio of southward projection length to total perimeter of the bottom of 0.25~0.35, which means this method is applicable for building performance simulation.

Fatigue Life Prediction on Steel Beam Bridges under Variable Amplitude Loading

Steel bridges are normally subjected to random loads with different traffic frequencies. They are structures with dynamic behavior and are subject to fatigue failure process, where the nucleation of a crack, growth and failure can occur. After locating and determining the size of an existing fault, it is important to predict the crack propagation and the convenient time for repair. Therefore, fracture mechanics and fatigue concepts are essential to the right approach to the problem. To study the fatigue crack growth, a computational code was developed by using the root mean square (RMS) and the cycle-by-cycle models. One observes the variable amplitude loading influence on the life structural prediction. Different loads histories and initial crack length were considered as input variables. Thus, it was evaluated the dispersion of results of the expected structural life choosing different initial parameters.

IntelligentLogger: A Heavy-Duty Vehicles Fleet Management System Based on IoT and Smart Prediction Techniques

Both daily and long-term management of a heavy-duty vehicles and construction machinery fleet is an extremely complicated and hard to solve issue. This is mainly due to the diversity of the fleet vehicles – machinery, which concerns not only the vehicle types, but also their age/efficiency, as well as the fleet volume, which is often of the order of hundreds or even thousands of vehicles/machineries. In the present paper we present “InteligentLogger”, a holistic heavy-duty fleet management system covering a wide range of diverse fleet vehicles. This is based on specifically designed hardware and software for the automated vehicle health status and operational cost monitoring, for smart maintenance. InteligentLogger is characterized by high adaptability that permits to be tailored to practically any heavy-duty vehicle/machinery (of different technologies -modern or legacy- and of dissimilar uses). Contrary to conventional logistic systems, which are characterized by raised operational costs and often errors, InteligentLogger provides a cost-effective and reliable integrated solution for the e-management and e-maintenance of the fleet members. The InteligentLogger system offers the following unique features that guarantee successful heavy-duty vehicles/machineries fleet management: (a) Recording and storage of operating data of motorized construction machinery, in a reliable way and in real time, using specifically designed Internet of Things (IoT) sensor nodes that communicate through the available network infrastructures, e.g., 3G/LTE; (b) Use on any machine, regardless of its age, in a universal way; (c) Flexibility and complete customization both in terms of data collection, integration with 3rd party systems, as well as in terms of processing and drawing conclusions; (d) Validation, error reporting & correction, as well as update of the system’s database; (e) Artificial intelligence (AI) software, for processing information in real time, identifying out-of-normal behavior and generating alerts; (f) A MicroStrategy based enterprise BI, for modeling information and producing reports, dashboards, and alerts focusing on vehicles– machinery optimal usage, as well as maintenance and scraping policies; (g) Modular structure that allows low implementation costs in the basic fully functional version, but offers scalability without requiring a complete system upgrade.

An Approach for Coagulant Dosage Optimization Using Soft Jar Test: A Case Study of Bangkhen Water Treatment Plant

The most important process of the water treatment plant process is coagulation, which uses alum and poly aluminum chloride (PACL). Therefore, determining the dosage of alum and PACL is the most important factor to be prescribed. This research applies an artificial neural network (ANN), which uses the Levenberg–Marquardt algorithm to create a mathematical model (Soft Jar Test) for chemical dose prediction, as used for coagulation, such as alum and PACL, with input data consisting of turbidity, pH, alkalinity, conductivity, and, oxygen consumption (OC) of the Bangkhen Water Treatment Plant (BKWTP), under the authority of the Metropolitan Waterworks Authority of Thailand. The data were collected from 1 January 2019 to 31 December 2019 in order to cover the changing seasons of Thailand. The input data of ANN are divided into three groups: training set, test set, and validation set. The coefficient of determination and the mean absolute errors of the alum model are 0.73, 3.18 and the PACL model are 0.59, 3.21, respectively.

Classification of Extreme Ground-Level Ozone Based on Generalized Extreme Value Model for Air Monitoring Station

Higher ground-level ozone (GLO) concentration adversely affects human health, vegetations as well as activities in the ecosystem. In Malaysia, most of the analysis on GLO concentration are carried out using the average value of GLO concentration, which refers to the centre of distribution to make a prediction or estimation. However, analysis which focuses on the higher value or extreme value in GLO concentration is rarely explored. Hence, the objective of this study is to classify the tail behaviour of GLO using generalized extreme value (GEV) distribution estimation the return level using the corresponding modelling (Gumbel, Weibull, and Frechet) of GEV distribution. The results show that Weibull distribution which is also known as short tail distribution and considered as having less extreme behaviour is the best-fitted distribution for four selected air monitoring stations in Peninsular Malaysia, namely Larkin, Pelabuhan Kelang, Shah Alam, and Tanjung Malim; while Gumbel distribution which is considered as a medium tail distribution is the best-fitted distribution for Nilai station. The return level of GLO concentration in Shah Alam station is comparatively higher than other stations. Overall, return levels increase with increasing return periods but the increment depends on the type of the tail of GEV distribution’s tail. We conduct this study by using maximum likelihood estimation (MLE) method to estimate the parameters at four selected stations in Peninsular Malaysia. Next, the validation for the fitted block maxima series to GEV distribution is performed using probability plot, quantile plot and likelihood ratio test. Profile likelihood confidence interval is tested to verify the type of GEV distribution. These results are important as a guide for early notification on future extreme ozone events.

Machine Learning for Music Aesthetic Annotation Using MIDI Format: A Harmony-Based Classification Approach

Swimming with the tide of deep learning, the field of music information retrieval (MIR) experiences parallel development and a sheer variety of feature-learning models has been applied to music classification and tagging tasks. Among those learning techniques, the deep convolutional neural networks (CNNs) have been widespreadly used with better performance than the traditional approach especially in music genre classification and prediction. However, regarding the music recommendation, there is a large semantic gap between the corresponding audio genres and the various aspects of a song that influence user preference. In our study, aiming to bridge the gap, we strive to construct an automatic music aesthetic annotation model with MIDI format for better comparison and measurement of the similarity between music pieces in the way of harmonic analysis. We use the matrix of qualification converted from MIDI files as input to train two different classifiers, support vector machine (SVM) and Decision Tree (DT). Experimental results in performance of a tag prediction task have shown that both learning algorithms are capable of extracting high-level properties in an end-to end manner from music information. The proposed model is helpful to learn the audience taste and then the resulting recommendations are likely to appeal to a niche consumer.