Machine Learning Techniques in Bank Credit Analysis

This paper compares classifier algorithms for credit risk assessment by applying different Machine Learning techniques. The study uses records from a Brazilian financial institution: a database of 5,432 companies that are clients of the bank, of which 2,600 are classified as non-defaulters, 1,551 as defaulters and 1,281 as temporary defaulters, meaning that the clients are overdue on their payments for up to 180 days. For each case, a total of 15 attributes was considered in a one-against-all assessment using four different techniques: Artificial Neural Networks Multilayer Perceptron (ANN-MLP), Artificial Neural Networks Radial Basis Functions (ANN-RBF), Logistic Regression (LR) and Support Vector Machines (SVM). For each method, different parameter settings were analyzed so that the best result of each technique could be compared. Initially the data were coded in thermometer code (for numerical attributes) or dummy coding (for nominal attributes). The methods were then evaluated for each parameter setting, and the best result of each technique was compared in terms of accuracy, false positives, false negatives, true positives and true negatives. This comparison showed that the best method, in terms of accuracy, was ANN-RBF (79.20% for non-defaulter classification, 97.74% for defaulters and 75.37% for the temporary defaulter classification). However, the best accuracy does not always indicate the best technique. For instance, in the classification of temporary defaulters, ANN-RBF was surpassed in terms of false positives by SVM, which had the lowest rate (0.07%) of false positive classifications. These details are discussed in light of the results, and the conclusion gives an overview of the study.
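
The coding and evaluation steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, thresholds and category names are hypothetical, and the dummy coding shown gives every category its own indicator (some conventions drop one reference category).

```python
def thermometer_encode(value, thresholds):
    """Thermometer code: one bit per threshold, set to 1 when value >= threshold.

    The thresholds are hypothetical; the abstract does not state the bins used.
    """
    return [1 if value >= t else 0 for t in thresholds]


def dummy_encode(value, categories):
    """Dummy code for a nominal attribute: a single 1 at the matching category."""
    return [1 if value == c else 0 for c in categories]


def one_vs_all_labels(labels, positive_class):
    """One-against-all: the chosen class becomes 1, every other class becomes 0."""
    return [1 if y == positive_class else 0 for y in labels]


def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels and derive accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "accuracy": (tp + tn) / len(pairs)}
```

Reporting the four counts alongside accuracy is what allows two classifiers with similar accuracy to be separated by their false positive rates, as in the SVM versus ANN-RBF comparison.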

Vision-Based Daily Routine Recognition for Healthcare with Transfer Learning

We propose to record Activities of Daily Living (ADLs) of elderly people using a vision-based system so as to provide better assistive and personalization technologies. Current ADL-related research is based on data collected from non-elderly subjects in laboratory environments, where the activities performed are predetermined for the sole purpose of data collection. To obtain more realistic datasets for the application, we recorded ADLs in a real-world environment involving real elderly subjects. Motivated by the need to collect data for more effective research related to elderly care, we chose to collect data in the room of an elderly person. Specifically, we installed a Kinect, a vision-based sensor, on the ceiling to capture the activities that the elderly subject performs every morning. Based on the data, we identified 12 morning activities that the elderly person performs daily. To recognize these activities, we created the HARELCARE framework to investigate the effectiveness of existing Human Activity Recognition (HAR) algorithms, and we propose the use of a transfer learning algorithm for HAR. We compared the algorithms in terms of accuracy and training progress. Although the collected dataset is relatively small, the proposed algorithm has good potential to be applied to all daily routine activities for healthcare purposes such as evidence-based diagnosis and treatment.

Investigation of Boll Properties on Cotton Picker Machine Performance

Cotton is a strategic crop that plays an important role in meeting human food and clothing needs because of its oil, protein, and fiber. Iran was once one of the largest cotton producers in the world, but its production has declined for economic reasons. One way to reduce the cost of cotton production is to expand the mechanization of cotton harvesting. Iranian farmers, however, have not accepted cotton harvesters, and one reason is the amount of field loss with these machines. As a result, the majority of cotton fields are harvested by hand. Although the correct setting of the harvesting machine is very important for cotton losses, the morphological properties of the cotton plant also affect the performance of cotton harvesters. In this study, the effects of several morphological properties on the performance of a cotton picker were evaluated: plant height, number and length of sympodial and monopodial branches, boll dimensions, boll weight, number of carpels and bract angle. The efficiency of a John Deere 9920 spindle cotton picker was investigated on five different Iranian cotton cultivars. The results indicate a significant difference between the five cultivars in terms of machine harvest efficiency. The Golestan cultivar showed the best harvester performance, with an average of 87.6% of total harvestable seed cotton, and the Khorshid cultivar showed the lowest. The principal component analysis showed that, at 50.76% probability, the cotton picker efficiency is affected positively by the bract angle and negatively by boll dimensions, the number of carpels and the height of the cotton plants. In the PCA scatter plot, the seed cotton remaining after harvest (on the plant and on the ground) fell in the same zone as boll dimensions and the number of carpels.

Practical Techniques of Improving State Estimator Solution

The State Estimator has become an intrinsic part of Energy Management Systems (EMS). The SCADA measurements received from the field are processed by the State Estimator in order to accurately determine the actual operating state of the power system and provide that information to other real-time network applications. All EMS vendors offer State Estimator functionality in their baseline products. However, setting up the State Estimator and ensuring that it consistently produces a reliable solution often consumes substantial engineering effort. This paper provides generic recommendations and describes a simple, practical approach to efficient tuning of the State Estimator, based on working experience with major EMS software platforms and consulting projects at many electrical utilities in the USA.

Keyloggers Prevention with Time-Sensitive Obfuscation

Nowadays, the abuse of keyloggers is one of the most widespread approaches to stealing sensitive information. In this paper, we propose and analyze an On-Screen Prompts Approach to Keyloggers (OSPAK), which is installed on public computers. OSPAK uses a canvas to cue users when their keystrokes are going to be logged or ignored by OSPAK. This approach protects computers against the recording of sensitive inputs by obfuscating keyloggers with letters inserted among the users' keystrokes. It adds a canvas below each password field in a webpage; the canvas consists of three parts: two background areas, a hit area and a moving foreground object. Letters typed in different valid time intervals are combined in the order of those intervals, and valid time intervals are interleaved with invalid time intervals. OSPAK uses animation to visualize valid and invalid time intervals and can be integrated into a webpage as a browser extension. We have tested it against a series of known keyloggers and also performed a study with 95 users to evaluate how easy the tool is to use. The results from the volunteers show that OSPAK is a simple approach.

Heuristic Methods for the Capacitated Location-Allocation Problem with Stochastic Demand

The proper number and appropriate locations of service centers can save cost, raise revenue and increase customer satisfaction. Service centers are costly to establish and difficult to relocate. Over long-term planning periods, several factors may affect the service. One of the most critical factors is uncertain customer demand. The opened service centers need to be capable of serving customers and making a profit even though the demand changes in each period. In this work, the capacitated location-allocation problem with stochastic demand is considered. A mathematical model is formulated to determine suitable locations of service centers and their allocation so as to maximize total profit over multiple planning periods. Two heuristic methods, a local search and a genetic algorithm, are used to solve this problem. For the local search, five different sets of probabilities for choosing each type of move are applied. For the genetic algorithm, three different replacement strategies are considered. The results of applying each method to numerical examples are compared. Both methods reach the same best-found solution in most examples, but the genetic algorithm provides better solutions in some cases.
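
The local search idea can be illustrated with a heavily simplified sketch. It is not the model from the paper: it assumes pooled demand (no customer-to-center distances), a single hypothetical move type (opening or closing one site), first-improvement acceptance, and demand uncertainty represented by a small list of equally likely scenarios. All names and numbers are illustrative.

```python
def expected_profit(open_sites, capacity, fixed_cost, scenarios, unit_revenue):
    """Average profit over demand scenarios for a given set of open sites.

    Simplification: total demand in a scenario is served up to the total
    capacity of the open sites; allocation to individual sites is ignored.
    """
    total_capacity = sum(capacity[s] for s in open_sites)
    total_fixed = sum(fixed_cost[s] for s in open_sites)
    served = [min(total_capacity, sum(demand)) for demand in scenarios]
    return unit_revenue * sum(served) / len(scenarios) - total_fixed


def local_search(capacity, fixed_cost, scenarios, unit_revenue):
    """Flip one site at a time (open <-> closed) and keep any improving move."""
    current = set()
    best = expected_profit(current, capacity, fixed_cost, scenarios, unit_revenue)
    improved = True
    while improved:
        improved = False
        for site in capacity:
            candidate = current ^ {site}  # toggle this site
            value = expected_profit(candidate, capacity, fixed_cost,
                                    scenarios, unit_revenue)
            if value > best:
                current, best, improved = candidate, value, True
    return current, best


if __name__ == "__main__":
    capacity = {"A": 50, "B": 30, "C": 40}
    fixed_cost = {"A": 100, "B": 90, "C": 200}
    scenarios = [[20, 30], [30, 40], [10, 20]]  # customer demands per scenario
    print(local_search(capacity, fixed_cost, scenarios, unit_revenue=5))
```

A genetic algorithm would explore the same open/closed decision vector with a population and a replacement strategy instead of single-site flips.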

Effect of Prefabricated Vertical Drain System Properties on Embankment Behavior

This study presents the effect of prefabricated vertical drain system properties on embankment behavior by numerically calculating the settlement, lateral displacement and induced excess pore pressure. To investigate this behavior, three different prefabricated vertical drains were simulated under an embankment. The finite element software PLAXIS was used to analyze the displacements and excess pore pressures. The results showed that the consolidation time and induced excess pore pressure are highly dependent on the discharge capacity of the prefabricated vertical drain. An increase in the discharge capacity shortens the consolidation process and reduces the induced excess pore pressure. Moreover, the spacing of the vertical drains does not have any significant effect on the consolidation time. However, an increase in the drain spacing would decrease the system stiffness.

Ozone Assisted Low Temperature Catalytic Benzene Oxidation over Al2O3, SiO2, AlOOH Supported Ni/Pd Catalysts

Catalytic oxidation of benzene assisted by ozone on alumina-, silica- and boehmite-supported Ni/Pd catalysts was investigated at 353 K to assess the influence of the support on the reaction. Three bimetallic Ni/Pd nanosized samples with loadings of 4.7% Ni and 0.17% Pd, supported on SiO2, AlOOH and Al2O3, were synthesized by the extractive-pyrolytic method. The phase composition was characterized by means of XRD, and the surface area and pore size were estimated using the Brunauer–Emmett–Teller (BET) and Barrett–Joyner–Halenda (BJH) methods. At the beginning of the reaction, the catalysts were significantly deactivated due to the accumulation of intermediates on the catalyst surface; after 60 minutes their activity became stable. The Ni/Pd/AlOOH catalyst showed the highest steady-state activity in comparison with the Ni/Pd/SiO2 and Ni/Pd/Al2O3 catalysts. The activity depends on the ozone decomposition potential of the catalysts, because ozone decomposition generates the oxidizing active species. The sample with the highest ozone decomposition ability, which correlated with the surface area of the support, oxidized benzene to the greatest extent.

Efficiency Analyses of Higher Education in Taiwan: Implications to Higher Education Crisis

This study applies nonparametric DEA to analyze Taiwan’s 46 comprehensive and 73 technical universities from 2012 to 2017. The inter-category comparison of the percentage of efficient universities reveals that, on the whole, private universities outperform public universities in the same category. In addition, comprehensive universities outperform technical universities. However, the trend analyses confirm that, facing the challenge of the higher education crisis, performance improvement is much more urgent for private comprehensive (PriCU), public technical (PubTECH) and private technical (PriTECH) universities than for public comprehensive universities (PubCU), and especially for PriTECH. The crisis in higher education has hit private universities harder than public ones, and technical universities harder than comprehensive ones, and is worsening fast. Moreover, for PubCU, PubTECH, and PriTECH to improve their overall operational efficiency, facilitating management efficiency or innovating in teaching and research is as crucial as optimizing operational scale. PriCU, by contrast, should first of all put more emphasis on improving scale efficiency to boost their efficiency. Regarding scale efficiency, pure technical efficiency and returns to scale must be considered together, and it therefore seems that no merger combination can both improve their efficiencies and resolve their urgent crisis. This suggests that PriCU, PubTECH, and PriTECH should pursue other options rather than a merger, such as raising income from outputs other than tuition fees, to reduce the shock as much as possible and thus improve their scale efficiency. Finally, the robustness test suggests that consolidated estimation is a more objective and fair evaluation of university efficiency.

Image Haze Removal Using Scene Depth Based Spatially Varying Atmospheric Light in Haar Lifting Wavelet Domain

This paper presents a method for single image dehazing based on the dark channel prior (DCP). The property that the intensity of the dark channel gives an approximate thickness of the haze is used to estimate the transmission and atmospheric light. Instead of constant atmospheric light, the proposed method employs scene depth to estimate spatially varying atmospheric light, as occurs in nature. The haze imaging model, together with the soft matting method, is used in this work to produce a high-quality haze-free image. Experimental results demonstrate that the proposed approach produces better results than the classic DCP approach: the color fidelity and contrast of the haze-free image are improved, and no over-saturation is observed in the sky region. Further, the lifting Haar wavelet transform is employed to reduce overall execution time by a factor of two to three compared to the conventional approach.
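
For orientation, the classic DCP pipeline that the abstract builds on can be sketched in plain Python. This sketch uses a constant atmospheric light, which is the baseline the paper improves on; it does not implement the proposed depth-based spatially varying light, soft matting, or the wavelet speed-up. The parameter values (`omega`, `t_min`, patch `radius`) are common illustrative defaults, not values taken from the paper.

```python
def dark_channel(image, radius=1):
    """Dark channel: minimum over RGB, then minimum over a local square patch.

    image is an H x W list of (r, g, b) tuples with values in [0, 1].
    """
    h, w = len(image), len(image[0])
    min_rgb = [[min(px) for px in row] for row in image]
    return [[min(min_rgb[j][i]
                 for j in range(max(0, y - radius), min(h, y + radius + 1))
                 for i in range(max(0, x - radius), min(w, x + radius + 1)))
             for x in range(w)]
            for y in range(h)]


def estimate_transmission(image, atmospheric_light, omega=0.95, radius=1):
    """t = 1 - omega * dark_channel(I / A), with a constant A per channel."""
    normalized = [[tuple(c / a for c, a in zip(px, atmospheric_light))
                   for px in row] for row in image]
    return [[1.0 - omega * d for d in row]
            for row in dark_channel(normalized, radius)]


def recover_radiance(image, transmission, atmospheric_light, t_min=0.1):
    """Invert the haze model I = J * t + A * (1 - t) for the scene radiance J."""
    return [[tuple((c - a) / max(t, t_min) + a
                   for c, a in zip(px, atmospheric_light))
             for px, t in zip(row, t_row)]
            for row, t_row in zip(image, transmission)]
```

The lower bound `t_min` keeps the division stable in dense haze; a spatially varying atmospheric light would replace the single `atmospheric_light` tuple with a per-pixel value.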

Normal and Peaberry Coffee Beans Classification from Green Coffee Bean Images Using Convolutional Neural Networks and Support Vector Machine

The aim of this study is to develop a system that can identify and sort peaberries automatically at low cost for coffee producers in developing countries. In this paper, the focus is on the classification of peaberries and normal coffee beans using image processing and machine learning techniques. The peaberry is neither a defective bean nor a normal bean. A peaberry forms when a coffee cherry produces a single, relatively round seed instead of the usual pair of flat-sided beans. It has a different value and flavor. To improve the taste of the coffee, it is necessary to separate peaberries from normal beans before the green coffee beans are roasted. Otherwise, the flavors will be mixed and the overall taste will suffer. During roasting, all beans must be uniform in shape, size, and weight; otherwise, larger beans take more time to roast through. The peaberry has a different size and shape even when it has the same weight as a normal bean, and it roasts more slowly than normal beans. Therefore, neither technique provides a good option to select the peaberries. Defective beans, e.g., sour, broken, black, and faded beans, are easy to check and pick out by hand. Picking out peaberries, on the other hand, is very difficult even for trained specialists because the shape and color of the peaberry are similar to those of normal beans. In this study, we use image processing and machine learning techniques to discriminate between normal and peaberry beans as part of the sorting system. As the first step, we applied Deep Convolutional Neural Networks (CNN) and Support Vector Machine (SVM) as machine learning techniques to discriminate between peaberries and normal beans. Better performance was obtained with CNN than with SVM for the discrimination of the peaberry. The neural network, trained in this work on a high-performance CPU and GPU, will simply be installed on an inexpensive Raspberry Pi system with low computing power. We assume that this system will be used in underdeveloped countries. The study evaluates and compares the feasibility of the methods in terms of classification accuracy and processing speed.

Effect of Plant Nutrients on Anthocyanin Content and Yield Component of Black Glutinous Rice Plants

The cultivation of black glutinous rice rich in anthocyanins can provide great benefits to both farmers and consumers. Total anthocyanin content and yield component data were studied for a black glutinous rice cultivar (KHHK) grown with the addition of mineral elements (Ca, Mg, Cu, Cr, Fe and Se) under soilless conditions. Ca application increased seed anthocyanin content threefold compared to controls. Cu application produced the highest number of grains per panicle and panicle length, and subsequently a high panicle weight. Se application had the largest effect on leaf anthocyanin content, the number of tillers, the number of panicles and 100-grain weight. These findings show that the addition of mineral elements had a positive effect on anthocyanin content in black rice plants and seeds, as well as on the growth of black glutinous rice plants.

Convergence Analysis of Training Two-Hidden-Layer Partially Over-Parameterized ReLU Networks via Gradient Descent

Over-parameterized neural networks have attracted a great deal of attention in recent deep learning theory research, as they challenge the classic view that a model with excessive parameters over-fits, and they have achieved empirical success in various settings. While a number of theoretical works have sought to demystify the properties of such models, their convergence properties are still far from being thoroughly understood. In this work, we study the convergence properties of training two-hidden-layer, partially over-parameterized, fully connected networks with the Rectified Linear Unit activation via gradient descent. To our knowledge, this is the first theoretical work to address the convergence properties of deep over-parameterized networks without the equally-wide-hidden-layer assumption and other unrealistic assumptions. We provide a probabilistic lower bound on the widths of the hidden layers and prove a linear convergence rate for gradient descent. We also conduct experiments on synthetic and real-world datasets to validate our theory.

Multi-Objective Optimal Design of a Cascade Control System for a Class of Underactuated Mechanical Systems

This paper presents a multi-objective optimal design of a cascade control system for an underactuated mechanical system. Cascade control structures usually include two control algorithms (inner and outer). To design such a control system properly, the following conflicting objectives should be considered at the same time: 1) the inner closed-loop control must be faster than the outer one, 2) the inner loop should quickly reject any disturbance and prevent it from propagating to the outer loop, 3) the controlled system should be insensitive to measurement noise, and 4) the controlled system should be driven by optimal energy. Such a control problem can be formulated as a multi-objective optimization problem in which the optimal trade-offs among these design goals are found. To the authors' best knowledge, such a problem has not been studied in multi-objective settings so far. In this work, an underactuated mechanical system consisting of a rotary servo motor and a ball and beam is used for the computer simulations, the setup parameters of the inner and outer control systems are tuned by NSGA-II (Non-dominated Sorting Genetic Algorithm), and the dominance concept is used to find the optimal design points. The solution of this problem is not a single optimal cascade controller, but rather a set of optimal cascade controllers (called the Pareto set) that represent the optimal trade-offs among the selected design criteria. The function evaluation of the Pareto set is called the Pareto front. The solution set is presented to the decision-maker, who can choose any point to implement. The simulation results, in terms of the Pareto front and time responses to external signals, show the competing nature of the design objectives. The presented study may become the basis for multi-objective optimal design of multi-loop control systems.
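
The dominance concept used to select the optimal design points can be sketched as a simple filter. This is only the dominance test on already evaluated objective vectors, assuming every objective is minimized; it is not NSGA-II itself, and the sample points are made up for illustration.

```python
def dominates(a, b):
    """a dominates b if a is no worse in every objective and better in at least one.

    All objectives are assumed to be minimized.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep the objective vectors that no other vector dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical (objective_1, objective_2) pairs for five controller designs.
    designs = [(1, 5), (2, 3), (3, 4), (4, 1), (2, 6)]
    print(pareto_front(designs))
```

In the example, `(3, 4)` is dominated by `(2, 3)` and `(2, 6)` by `(1, 5)`, so three non-dominated trade-offs remain for the decision-maker to choose from.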

Forecasting Stock Indexes Using Bayesian Additive Regression Tree

Forecasting the stock market is a very challenging task. Various economic indicators such as GDP, exchange rates, interest rates, and unemployment have a substantial impact on the stock market. Time series models are the traditional methods used to predict stock market changes. In this paper, a machine learning method, the Bayesian Additive Regression Tree (BART), is used to predict stock market indexes based on multiple economic indicators. BART can be used to model heterogeneous treatment effects, and thereby works well when models are misspecified. It also has the capability to handle non-linear main effects and multi-way interactions without much input from financial analysts. In this research, BART is proposed to provide reliable predictions of day-to-day stock market activity. A comparison of the results from BART with those from a time series method shows that BART performs well and has better prediction capability than the traditional methods.

Modeling and Analysis of a Cycling Prosthetic

There are currently many people living with limb loss in the USA, and this number is expected to increase over the next decade. The main causes of amputation range from vascular disease to trauma and cancer. Many patients have a single prosthetic for the first year but end up getting a second one to accommodate their changing physique. Afterwards, the prosthesis is replaced every three to five years depending on how often it is used. This could cost the patient up to $500,000 over their lifetime. Complications do not end there, however. Due to the absence of nerves, it becomes more difficult to traverse terrain with a prosthetic. Moving on an incline or decline becomes difficult, so curbs and stairs can be a challenge. Certain physical activities, such as cycling, can be even more strenuous. Cycling needs to be relearned to accommodate the change in weight, center of gravity, and transfer of energy from the leg to the pedal. The purpose of this research project is to develop a new, alternative below-knee cycling prosthetic using Dieter & Schmidt’s design process approach. It will be subjected to fatigue analysis under dynamic loading to observe the limitations as well as the strengths and weaknesses of the prosthetic. Benchmark comparisons will be made between existing prosthetics and the proposed one, examining the benefits and disadvantages. The resulting prosthetic will be 3D printed using acrylonitrile butadiene styrene (ABS) or polycarbonate (PC) plastic.

Digestibility in Yankasa Rams Fed Brachiaria ruziziensis – Centrosema pascuorum Hay Mixtures with Concentrate

This study investigated the digestibility of Brachiaria ruziziensis and Centrosema pascuorum hay mixtures at varying proportions in Yankasa rams. Twelve Yankasa rams with an average initial weight of 10.25 ± 0.1 kg were assigned to three dietary treatments of B. ruziziensis and C. pascuorum hay in different mixtures (75BR:25CP, 50BR:50CP and 25BR:75CP, respectively) in a Completely Randomized Design (CRD) for a period of 14 days. A concentrate diet was given to the experimental animals as a supplement at a fixed proportion, while the forage mixture (basal diet) was fed at 3% of body weight. Animals on 50BR:50CP had better nutrient digestibility (crude protein, acid and neutral detergent fibre, ether extract and nitrogen free extract) than those on the other treatment diets, except for dry matter digestibility (87.35%), which was comparable to the 87.54% obtained with the 25BR:75CP treatment diet, and for organic matter digestibility. All nitrogen balance parameters, with the exception of nitrogen retained, were significantly higher (P < 0.05) in animals fed the 25BR:75CP diet, but were statistically similar to the values obtained for animals on the 50BR:50CP diet. From the results obtained in this study, it is concluded that the mixture of 25%BR75%CP gave the best nutrient digestibility and nitrogen balance in Yankasa rams. It is therefore recommended that B. ruziziensis and C. pascuorum should be fed at 50:50 mixture ratio for enhanced animal growth and performance in Nigeria.

In situ Real-Time Multivariate Analysis of Methanolysis Monitoring of Sunflower Oil Using FTIR

The combination of world population growth and the third industrial revolution has led to high demand for fuels. At the same time, the decline of global fossil fuel deposits and the air pollution caused by these fuels have compounded the challenges the world faces due to its need for energy. Therefore, new forms of environmentally friendly and renewable fuels such as biodiesel are needed. The primary analytical techniques for monitoring methanolysis yield have been chromatography and spectroscopy; these methods have proven reliable but are more demanding and costly, and do not provide real-time monitoring. In this work, the in situ monitoring of biodiesel production from sunflower oil using FTIR (Fourier Transform Infrared) spectroscopy was studied; the study was performed using an EasyMax Mettler Toledo reactor equipped with a DiComp (Diamond) probe. The quantitative monitoring of methanolysis was performed by building a quantitative model with multivariate calibration using the iC Quant module of the iC IR 7.0 software. Fifteen samples of known concentrations, taken in duplicate, were used for model calibration and cross-validation. Data were pre-processed using mean centering and variance scaling, spectrum math square root and solvent subtraction. These pre-processing methods improved RMSEC from 7.98 to 0.0096, RMSECV from 11.2 to 3.41, RMSEP from 6.32 to 2.72 and R2Cum from 0.9416 to 0.9999. The R2 values of 1 (training), 0.9918 (test) and 0.9946 (cross-validation) indicated the fitness of the model built. The model was tested against a univariate model; small discrepancies were observed at low concentrations due to unmodelled intermediates, but the two models were quite close at concentrations above 18%. The software eliminated the complexity of Partial Least Squares (PLS) chemometrics. It was concluded that the model obtained could be used to monitor the methanolysis of sunflower oil at industrial and lab scale.
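
The performance indexes quoted above are variants of two standard quantities, the root mean square error and the coefficient of determination. The sketch below shows the textbook definitions only; it is an assumption that RMSEC, RMSECV and RMSEP apply this formula to the calibration, cross-validation and prediction sets respectively, and the software's exact conventions (for example any degrees-of-freedom correction) are not stated in the abstract.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between reference and predicted concentrations."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

Computing the same function on held-out samples rather than on the calibration samples is what distinguishes a prediction error from a calibration error.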

Significance of Bike-Frame Geometric Factors for Cycling Efficiency and Muscle Activation

With the advocacy of green transportation and green travel, cycling has become increasingly popular. Physiology and bike design are key factors influencing cycling efficiency. This study therefore aimed to investigate the significance of bike-frame geometric factors for cycling efficiency and muscle activation across different body sizes of non-professional Asian male cyclists. Participants who represented various body sizes, as measured by leg and back lengths, carried out cycling tests on a tailor-assembled road bike with different ergonomic design configurations, including seat-height adjustments (i.e., 96%, 100%, and 104% of trochanteric height) and bike frame sizes (i.e., small and medium frames), over an assessable distance of 1 km. A specific power meter and a self-developed adaptable surface electromyography (sEMG) system were used to measure, respectively, the average pedaling power and cadence generated and the muscle activation. The results showed that changing the seat height was far more significant than the body and bike frame sizes. The sEMG data provided a better understanding of muscle activation as a function of seat height. The study therefore concludes that seat height was the major bike ergonomic design factor dominating the cycling efficiency of Asian participants with different body sizes.

Application of Heuristic Integration Ant Colony Optimization in Path Planning

This paper studies path planning based on ant colony optimization (ACO) and proposes heuristic integration ant colony optimization (HIACO). The paper analyzes and optimizes the principle of the algorithm, and also simulates and analyzes the parameters related to the application of HIACO in path planning. Compared with the original algorithm, the improved algorithm optimizes the probability formula, the tabu table mechanism and the updating mechanism, and introduces more reasonable heuristic factors. The optimized HIACO not only draws on the ideas of the original algorithm, but also alleviates, to some extent, the problems of premature convergence, convergence to a suboptimal solution and improper exploration. HIACO achieves better simulation results and the desired optimization. Several parameters of HIACO are tested in combination with the probability formula and the update formula. This paper proves the principle of HIACO and gives the best parameter range for path planning.
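
For reference, the probability formula and update mechanism that HIACO modifies can be sketched in their standard Ant System form. This is the baseline ACO rule, not HIACO: the abstract does not give the improved formulas, and the parameter names (`alpha`, `beta`, `rho`, `q`) and the toy graph are illustrative assumptions.

```python
def transition_probabilities(current, candidates, pheromone, distance,
                             alpha=1.0, beta=2.0):
    """Standard ACO rule: p_ij is proportional to tau_ij^alpha * (1/d_ij)^beta.

    candidates are the nodes still allowed by the tabu list.
    """
    weights = {j: (pheromone[(current, j)] ** alpha)
                  * ((1.0 / distance[(current, j)]) ** beta)
               for j in candidates}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def update_pheromone(pheromone, tours, rho=0.5, q=1.0):
    """Evaporate all trails by rho, then deposit q / length on each used edge.

    tours is a list of (edges, tour_length) pairs, one per ant.
    """
    updated = {edge: (1.0 - rho) * tau for edge, tau in pheromone.items()}
    for edges, length in tours:
        for edge in edges:
            updated[edge] += q / length
    return updated


if __name__ == "__main__":
    pheromone = {("a", "b"): 1.0, ("a", "c"): 1.0}
    distance = {("a", "b"): 1.0, ("a", "c"): 2.0}
    print(transition_probabilities("a", ["b", "c"], pheromone, distance))
```

With equal pheromone on both edges, the shorter edge is favored through the heuristic term `1/d`, which is the factor an improved heuristic would replace or extend.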