Convergence Analysis of Training Two-Hidden-Layer Partially Over-Parameterized ReLU Networks via Gradient Descent

Over-parameterized neural networks have attracted a great deal of attention in recent deep learning theory research, as they challenge the classic perspective of over-fitting when the model has excessive parameters and have gained empirical success in various settings. While a number of theoretical works have been presented to demystify properties of such models, their convergence properties are still far from being thoroughly understood. In this work, we study the convergence properties of training two-hidden-layer partially over-parameterized fully connected networks with the Rectified Linear Unit activation via gradient descent. To our knowledge, this is the first theoretical work to analyze convergence properties of deep over-parameterized networks without the equally-wide-hidden-layer assumption and other unrealistic assumptions. We provide a probabilistic lower bound on the widths of the hidden layers and prove a linear convergence rate for gradient descent. We also conduct experiments on synthetic and real-world datasets to validate our theory.

Aliasing Free and Additive Error in Spectra for Alpha Stable Signals

This work focuses on the continuous-time symmetric alpha stable process, which is frequently used to model signals with indefinitely growing variance and is often observed with an unknown additive error. The objective of this paper is to estimate this error from discrete observations of the signal. To that end, we propose a method based on smoothing the observations with a Jackson polynomial kernel while taking into account the width of the interval where the spectral density is non-zero. This technique avoids the aliasing phenomenon encountered when the estimation is made from discrete observations of a continuous-time process. We study the convergence rate of the estimator and show that it improves when the spectral density is zero at the origin. Thus, we set up an estimator of the additive error that can be subtracted to approach the original, error-free signal.

Numerical Solution of Steady Magnetohydrodynamic Boundary Layer Flow Due to Gyrotactic Microorganism for Williamson Nanofluid over Stretched Surface in the Presence of Exponential Internal Heat Generation

This paper studies the two-dimensional, steady, incompressible, viscous magnetohydrodynamic (MHD) flow of a Williamson nanofluid containing gyrotactic microorganisms over a stretching sheet, with exponential internal heat generation. The governing equations and auxiliary conditions are reduced to a set of non-linear coupled differential equations with the appropriate boundary conditions using a similarity transformation. The transformed equations are solved numerically by the spectral relaxation method. The influences of various parameters such as the Williamson parameter γ, power constant λ, Prandtl number Pr, magnetic field parameter M, Peclet number Pe, Lewis number Le, bioconvection Lewis number Lb, Brownian motion parameter Nb, thermophoresis parameter Nt, and bioconvection constant σ are studied to obtain the momentum, heat, mass and microorganism distributions. Momentum, heat, mass and gyrotactic microorganism profiles are explored through graphs and tables. We compute the heat transfer rate, the mass flux rate and the density number of the motile microorganisms near the surface. Our numerical results are in good agreement with existing calculations. The residual error of the obtained solutions is determined in order to assess the convergence rate against the iteration count. Faster convergence is achieved when internal heat generation is absent. The magnetic parameter M decreases the momentum boundary layer thickness but increases the thermal boundary layer thickness. The bioconvection Lewis number and the bioconvection parameter have a pronounced effect on the microorganism boundary layer. Increasing the Brownian motion parameter and the Lewis number decreases the thermal boundary layer. Furthermore, the magnetic field parameter and the thermophoresis parameter have an induced effect on the concentration profiles.

Adaptive Filtering in Subbands for Supervised Source Separation

This paper investigates MIMO (Multiple-Input Multiple-Output) adaptive filtering techniques for the application of supervised source separation in the context of convolutive mixtures. From the observation that there is correlation among the signals of the different mixtures, an improvement in the NSAF (Normalized Subband Adaptive Filter) algorithm is proposed in order to accelerate its convergence rate. Simulation results with mixtures of speech signals in reverberant environments show the superior performance of the proposed algorithm with respect to the performances of the NLMS (Normalized Least-Mean-Square) and conventional NSAF, considering both the convergence speed and SIR (Signal-to-Interference Ratio) after convergence.

A Transform Domain Function Controlled VSSLMS Algorithm for Sparse System Identification

The convergence rate of the least-mean-square (LMS) algorithm deteriorates if the input signal to the filter is correlated. In a system identification problem, this convergence rate can be improved if the signal is white and/or if the system is sparse. We recently proposed a sparse transform domain LMS-type algorithm that uses a variable step-size for sparse system identification. The proposed algorithm provided high performance even when the input signal is highly correlated. In this work, we investigate the performance of the proposed TD-LMS algorithm for a large number of filter taps, which is also a critical issue for the standard LMS algorithm. Additionally, the optimum value of the most important parameter is calculated for all experiments. Moreover, the convergence analysis of the proposed algorithm is provided. The performance of the proposed algorithm has been compared to that of different algorithms in a sparse system identification setting with different sparsity levels and different numbers of filter taps. Simulations have shown that the proposed algorithm has prominent performance compared to the other algorithms.

Ramp Rate and Constriction Factor Based Dual Objective Economic Load Dispatch Using Particle Swarm Optimization

Economic Load Dispatch (ELD) is a vital optimization process in electric power systems for allocating generation amongst various units and for computing the cost of generation and the cost of emission of global warming gases such as sulphur dioxide, nitrous oxide and carbon monoxide. In this work, we emphasize ramp rate constriction factor based particle swarm optimization (RRCPSO) for analyzing various performance objectives, namely the cost of generation, the cost of emission, and a dual objective function involving both, through simulated experimental results. A 6-unit, 30-bus IEEE test case system has been used for the simulations, which involve an improved weight factor and advanced ramp rate limit constraints for optimizing the total cost of generation and emission. This method increases the tendency of particles to venture into the solution space and thereby improves their convergence rates. Earlier approaches based on dispersed PSO (DPSO) and constriction factor based PSO (CPSO) require comparatively higher computational time and yield poorer solutions than the present work. This paper uses ramp rate and constriction factor based PSO to compute the cost, emission and total objectives, and compares the results with DPSO and weight improved PSO (WIPSO), showing lower computational time and a better optimal solution.
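The abstract does not spell out the RRCPSO update, so the following is only a minimal sketch of the two ingredients it names, under stated assumptions: the constriction factor is taken to be the standard Clerc-Kennedy coefficient, and the ramp rate constraint is modeled as a simple clamp on the change in a unit's output between consecutive intervals. The function names and parameters (`constriction_factor`, `clamp_ramp`, `update_velocity`, `ramp_up`, `ramp_down`) are hypothetical and are not taken from the paper.

```python
import math


def constriction_factor(c1: float, c2: float) -> float:
    """Clerc-Kennedy constriction coefficient (assumed form); needs c1 + c2 > 4."""
    phi = c1 + c2
    if phi <= 4.0:
        raise ValueError("c1 + c2 must exceed 4 for the constriction factor")
    return 2.0 / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))


def update_velocity(v, x, pbest, gbest, c1, c2, r1, r2):
    """One constricted velocity update for a single coordinate of a particle.

    r1 and r2 are random numbers in [0, 1] supplied by the caller.
    """
    chi = constriction_factor(c1, c2)
    return chi * (v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x))


def clamp_ramp(p_new, p_prev, ramp_up, ramp_down):
    """Hypothetical ramp rate limit: keep a unit's new output within
    [p_prev - ramp_down, p_prev + ramp_up]."""
    return max(p_prev - ramp_down, min(p_new, p_prev + ramp_up))
```

With the commonly used choice c1 = c2 = 2.05, the coefficient is roughly 0.73, which damps the velocity without an explicit velocity limit.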

Affine Projection Adaptive Filter with Variable Regularization

We propose two affine projection algorithms (APA) with a variable regularization parameter. The proposed algorithms dynamically update the regularization parameter, which is fixed in the conventional regularized APA (R-APA), using a gradient descent based approach. By introducing the normalized gradient, the proposed algorithms yield an efficient and robust update scheme for the regularization parameter. Through experiments we demonstrate that the proposed algorithms outperform conventional R-APA in terms of the convergence rate and the misadjustment error.
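For orientation, the conventional R-APA baseline referred to above is usually written as follows. This is the textbook form, not the proposed variable-regularization update, which the abstract does not give. The symbols are assumptions introduced here: w(n) is the length-N weight vector, X(n) the N×K matrix of the K most recent input vectors, d(n) the vector of desired samples, μ the step size and δ the fixed regularization parameter.

```latex
\begin{aligned}
\mathbf{X}(n) &= \bigl[\mathbf{x}(n),\ \mathbf{x}(n-1),\ \dots,\ \mathbf{x}(n-K+1)\bigr], \\
\mathbf{e}(n) &= \mathbf{d}(n) - \mathbf{X}^{T}(n)\,\mathbf{w}(n), \\
\mathbf{w}(n+1) &= \mathbf{w}(n) + \mu\,\mathbf{X}(n)\bigl[\mathbf{X}^{T}(n)\,\mathbf{X}(n) + \delta\,\mathbf{I}\bigr]^{-1}\mathbf{e}(n).
\end{aligned}
```

A larger δ makes the matrix inverse better conditioned but slows adaptation, which is the trade-off a variable δ is meant to manage.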

Variable Regularization Parameter Normalized Least Mean Square Adaptive Filter

We present a normalized LMS (NLMS) algorithm with robust regularization. Unlike conventional NLMS with a fixed regularization parameter, the proposed approach dynamically updates the regularization parameter. By exploiting a gradient descent direction, we derive a computationally efficient and robust update scheme for the regularization parameter. In simulation, we demonstrate that the proposed algorithm outperforms conventional NLMS algorithms in terms of convergence rate and misadjustment error.
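As a point of reference, the conventional NLMS baseline with a fixed regularization parameter can be sketched as below. This is not the proposed variable-regularization scheme, whose update rule is not given in the abstract. The names `nlms_identify` and `simulate`, the step size `mu`, the regularization `delta`, and the three-tap test system are all illustrative assumptions.

```python
import random


def nlms_identify(x, d, num_taps, mu=0.5, delta=1e-3):
    """Conventional NLMS with a fixed regularization parameter delta.

    x: input samples, d: desired samples, both sequences of equal length.
    Returns the final weight vector and the a priori error sequence.
    """
    w = [0.0] * num_taps
    errors = []
    for n in range(len(x)):
        # Regressor [x[n], x[n-1], ...], zero-padded before the first sample.
        u = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(wk * uk for wk, uk in zip(w, u))
        e = d[n] - y
        # delta keeps the normalization finite when the regressor energy is small.
        norm = sum(uk * uk for uk in u) + delta
        w = [wk + mu * e * uk / norm for wk, uk in zip(w, u)]
        errors.append(e)
    return w, errors


def simulate(seed=0, n_samples=2000):
    """Identify a hypothetical noise-free 3-tap FIR system from white input."""
    rng = random.Random(seed)
    h = [0.8, -0.4, 0.2]
    x = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    d = [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
         for n in range(n_samples)]
    w, errors = nlms_identify(x, d, num_taps=len(h))
    return h, w, errors
```

Calling `simulate()` returns the true taps, the estimated taps and the error sequence, so the decay of the error over time can be inspected directly.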

An Intelligent Text Independent Speaker Identification Using VQ-GMM Model Based Multiple Classifier System

Speaker Identification (SI) is the task of establishing the identity of an individual based on his/her voice characteristics. The SI task is typically achieved by two-stage signal processing: training and testing. The training process calculates speaker-specific feature parameters from the speech and generates speaker models accordingly. In the testing phase, speech samples from unknown speakers are compared with the models and classified. Even though the performance of speaker identification systems has improved due to recent advances in speech processing techniques, there is still room for improvement. In this paper, a Closed-Set Text-Independent Speaker Identification System (CISI) based on a Multiple Classifier System (MCS) is proposed, using Mel Frequency Cepstrum Coefficients (MFCC) for feature extraction and a suitable combination of vector quantization (VQ) and Gaussian Mixture Model (GMM) together with the Expectation Maximization (EM) algorithm for speaker modeling. The use of a Voice Activity Detector (VAD) with a hybrid approach based on Short Time Energy (STE) and statistical modeling of background noise in the pre-processing step of the feature extraction yields a better and more robust automatic speaker identification system. In addition, using the Linde-Buzo-Gray (LBG) clustering algorithm to initialize the GMM parameters estimated in the EM step improved the convergence rate and the system's performance. The system also uses a relative index as a confidence measure when the GMM and VQ identification results contradict each other. Simulation results carried out on the voxforge.org speech database using MATLAB highlight the efficacy of the proposed method compared to earlier work.

Improving the Performance of Back-Propagation Training Algorithm by Using ANN

An Artificial Neural Network (ANN) can be trained using back propagation (BP). It is the most widely used algorithm for supervised learning with multi-layered feed-forward networks. Efficient learning by the BP algorithm is required for many practical applications. The BP algorithm calculates the weight changes of artificial neural networks, and a common approach is to use a two-term algorithm consisting of a learning rate (LR) and a momentum factor (MF). The major drawbacks of the two-term BP learning algorithm are the problems of local minima and slow convergence speeds, which limit the scope for real-time applications. Recently the addition of an extra term, called a proportional factor (PF), to the two-term BP algorithm was proposed. The third term increases the speed of the BP algorithm. However, the PF term also reduces the convergence of the BP algorithm, and criteria for evaluating convergence are required to facilitate the application of the three-term BP algorithm. Although these two seem to be closely related, as described later, we summarize various improvements to overcome the drawbacks. Here we compare the different methods of convergence of the new three-term BP algorithm.

Comparison of Two Types of Preconditioners for Stokes and Linearized Navier-Stokes Equations

To solve saddle point systems efficiently, several preconditioners have been published. There are many methods for constructing preconditioners for linear systems arising from saddle point problems; for instance, the relaxed dimensional factorization (RDF) preconditioner and the augmented Lagrangian (AL) preconditioner are used for both steady and unsteady Navier-Stokes equations. In this paper we compare the RDF preconditioner with the modified AL (MAL) preconditioner to show which is more effective for solving the Navier-Stokes equations. Numerical experiments indicate that the MAL preconditioner is more efficient and robust, especially for moderate viscosities and stretched grids in steady problems. For unsteady cases, the convergence rate of the RDF preconditioner is slightly faster than that of the MAL preconditioner in some circumstances, but the RDF preconditioner is more sensitive to its parameter than the MAL preconditioner. Moreover, the convergence rate of the MAL preconditioner is still quite acceptable. Therefore we conclude that the MAL preconditioner is more competitive than the RDF preconditioner. These experiments are implemented with the IFISS package.

GMDH Modeling Based on Polynomial Spline Estimation and Its Applications

The GMDH algorithm can describe the internal structure of objects well. In the process of modeling, automatic screening of the model structure and variables ensures the convergence rate. This paper studies a new GMDH model based on polynomial spline estimation. The polynomial spline function is used instead of the transfer function of GMDH to characterize the relationship between the input variables and output variables. It is proved that the algorithm has the optimal convergence rate under some conditions. The empirical results show that the algorithm can forecast the Consumer Price Index (CPI) well.

Global Exponential Stability of Impulsive BAM Fuzzy Cellular Neural Networks with Time Delays in the Leakage Terms

In this paper, a class of impulsive BAM fuzzy cellular neural networks with time delays in the leakage terms is formulated and investigated. By establishing a delay differential inequality and M-matrix theory, some sufficient conditions ensuring the existence, uniqueness and global exponential stability of equilibrium point for impulsive BAM fuzzy cellular neural networks with time delays in the leakage terms are obtained. In particular, a precise estimate of the exponential convergence rate is also provided, which depends on system parameters and impulsive perturbation intention. It is believed that these results are significant and useful for the design and applications of BAM fuzzy cellular neural networks. An example is given to show the effectiveness of the results obtained here.

Affine Projection Algorithm with Variable Data-Reuse Factor

This paper suggests a new Affine Projection (AP) algorithm with variable data-reuse factor using the condition number as a decision factor. To reduce computational burden, we adopt a recently reported technique which estimates the condition number of an input data matrix. Several simulations show that the new algorithm has better performance than that of the conventional AP algorithm.

Performance Analysis of a Series of Adaptive Filters in Non-Stationary Environment for Noise Cancelling Setup

Noise cancellation is an essential component of many DSP applications. Changes in real-time signals are rapid. In noise cancellation, a reference signal that approximates the noise signal (which corrupts the original information signal) is obtained and then subtracted from the noise-bearing signal to obtain a noise-free signal. This approximation of the noise signal is obtained through adaptive filters, which are self-adjusting. As the changes in real-time signals are abrupt, this requires an adaptive algorithm that converges fast and is stable. Least mean square (LMS) and normalized LMS (NLMS) are two widely used algorithms because of their simplicity in calculation and implementation, but their convergence is slow. Adaptive averaging filters (AFA) are also used because they converge quickly, but they are less stable. This paper provides a comparative study of LMS, NLMS, AFA and new enhanced average adaptive (Average NLMS, ANLMS) filters for noise cancelling applications using speech signals.

Comparison of Particle Swarm Optimization and Genetic Algorithm for TCSC-based Controller Design

Recently, genetic algorithms (GA) and particle swarm optimization (PSO) technique have attracted considerable attention among various modern heuristic optimization techniques. Since the two approaches are supposed to find a solution to a given objective function but employ different strategies and computational effort, it is appropriate to compare their performance. This paper presents the application and performance comparison of PSO and GA optimization techniques, for Thyristor Controlled Series Compensator (TCSC)-based controller design. The design objective is to enhance the power system stability. The design problem of the FACTS-based controller is formulated as an optimization problem and both the PSO and GA optimization techniques are employed to search for optimal controller parameters. The performance of both optimization techniques in terms of computational time and convergence rate is compared. Further, the optimized controllers are tested on a weakly connected power system subjected to different disturbances, and their performance is compared with the conventional power system stabilizer (CPSS). The eigenvalue analysis and non-linear simulation results are presented and compared to show the effectiveness of both the techniques in designing a TCSC-based controller, to enhance power system stability.

Stability Analysis of Impulsive BAM Fuzzy Cellular Neural Networks with Distributed Delays and Reaction-diffusion Terms

In this paper, a class of impulsive BAM fuzzy cellular neural networks with distributed delays and reaction-diffusion terms is formulated and investigated. By employing the delay differential inequality and inequality technique developed by Xu et al., some sufficient conditions ensuring the existence, uniqueness and global exponential stability of equilibrium point for impulsive BAM fuzzy cellular neural networks with distributed delays and reaction-diffusion terms are obtained. In particular, the estimate of the exponential convergence rate is also provided, which depends on system parameters, diffusion effect and impulsive disturbed intention. It is believed that these results are significant and useful for the design and applications of BAM fuzzy cellular neural networks. An example is given to show the effectiveness of the results obtained here.

New Subband Adaptive IIR Filter Based On Polyphase Decomposition

We present a subband adaptive infinite-impulse response (IIR) filtering method, which is based on a polyphase decomposition of the IIR filter. Motivated by the fact that the polyphase structure has benefits in terms of convergence rate and stability, we introduce the polyphase decomposition to subband IIR filtering, i.e., in each subband a high-order IIR filter is decomposed into lower-order polyphase IIR filters. Computer simulations demonstrate that the proposed method has an improved convergence rate over conventional IIR filters.

Preconditioned Mixed-Type Splitting Iterative Method For Z-Matrices

In this paper, we present the preconditioned mixed-type splitting iterative method for solving linear systems Ax = b, where A is a Z-matrix. We give some comparison theorems to show that the convergence rate of the preconditioned mixed-type splitting iterative method is faster than that of the mixed-type splitting iterative method. Finally, we give a numerical example to illustrate our results.
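The comparison of convergence rates can be read against the general framework of stationary splitting iterations, sketched here in generic form. The specific mixed-type splitting and the preconditioner are not stated in the abstract, so M, N and P below are placeholder symbols, not the authors' matrices.

```latex
\begin{aligned}
A &= M - N, \qquad x^{(k+1)} = M^{-1}N\,x^{(k)} + M^{-1}b, \\
&\text{the iteration converges for every } x^{(0)} \iff \rho\bigl(M^{-1}N\bigr) < 1, \\
PAx &= Pb, \qquad PA = M_P - N_P, \qquad
\text{the preconditioned method is asymptotically faster if } \rho\bigl(M_P^{-1}N_P\bigr) < \rho\bigl(M^{-1}N\bigr).
\end{aligned}
```

Here ρ denotes the spectral radius, so a comparison theorem of the kind described amounts to an inequality between the spectral radii of the two iteration matrices.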

Comparison of Three Versions of Conjugate Gradient Method in Predicting an Unknown Irregular Boundary Profile

An inverse geometry problem is solved to predict an unknown irregular boundary profile. The aim is to minimize the objective function, which is the difference between real and computed temperatures, using three different versions of the Conjugate Gradient Method. The gradient of the objective function, which is required by this method, is obtained by solving the adjoint equation. The abilities of the three versions of the Conjugate Gradient Method in predicting the boundary profile are compared using a numerical algorithm based on the method. The predicted shapes show that, due to its convergence rate and the accuracy of the predicted values, the Powell-Beale version of the method is more effective than the Fletcher-Reeves and Polak-Ribiere versions.
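The versions differ mainly in how the conjugation coefficient for the next search direction is computed. The sketch below shows the standard Fletcher-Reeves and Polak-Ribiere formulas only. The Powell-Beale version adds a restart procedure that is not reproduced here, and the adjoint-based gradient computation from the paper is assumed to be available as plain lists of numbers. The function names are illustrative.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def beta_fletcher_reeves(g_new, g_old):
    """Fletcher-Reeves coefficient: |g_new|^2 / |g_old|^2."""
    return dot(g_new, g_new) / dot(g_old, g_old)


def beta_polak_ribiere(g_new, g_old):
    """Polak-Ribiere coefficient: g_new . (g_new - g_old) / |g_old|^2."""
    diff = [gn - go for gn, go in zip(g_new, g_old)]
    return dot(g_new, diff) / dot(g_old, g_old)


def next_direction(g_new, d_old, beta):
    """New search direction: steepest descent plus beta times the old direction."""
    return [-gn + beta * do for gn, do in zip(g_new, d_old)]
```

The two coefficients coincide when successive gradients are orthogonal, as they are for exact line searches on a quadratic objective, and differ otherwise.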