Improving the Performance of Back-Propagation Training Algorithm by Using ANN

An Artificial Neural Network (ANN) can be trained using back-propagation (BP),
the most widely used algorithm for supervised learning with multi-layered
feed-forward networks. Efficient learning with the BP algorithm is required for
many practical applications. The BP algorithm calculates the weight changes of
an artificial neural network, and a common approach is the two-term algorithm,
which combines a learning rate (LR) and a momentum factor (MF). The major
drawbacks of the two-term BP learning algorithm are local minima and slow
convergence, which limit its scope for real-time applications. Recently, the
addition of an extra term, called a proportional factor (PF), to the two-term
BP algorithm was proposed. This third term increases the speed of the BP
algorithm. However, the PF term can also impair the convergence of the BP
algorithm, and criteria for evaluating convergence are required to facilitate
the application of the three-term BP algorithm. Although local minima and slow
convergence appear to be closely related, as described later, we summarize the
various improvements proposed to overcome these drawbacks. Here we compare the
different methods for achieving convergence of the new three-term BP algorithm.
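
As an illustration of the update rules discussed above, the sketch below
contrasts the two-term update (learning rate plus momentum) with the three-term
update that adds a proportional-factor term tied to the current output error.
This is a minimal sketch assuming a single scalar weight and a squared-error
loss; the function names, the LR/MF/PF values, and the toy single-neuron
example are illustrative assumptions, not taken from the paper.

```python
def two_term_update(grad, prev_delta, lr=0.1, mf=0.9):
    """Two-term BP step: delta_w(t) = -LR * dE/dw + MF * delta_w(t-1)."""
    return -lr * grad + mf * prev_delta


def three_term_update(grad, prev_delta, error, lr=0.1, mf=0.9, pf=0.05):
    """Three-term BP step: adds a proportional-factor term, PF * e(t),
    proportional to the current output error (illustrative constants)."""
    return -lr * grad + mf * prev_delta + pf * error


# Toy usage: one update of a single scalar weight in a linear neuron y = w * x,
# with squared-error loss E = 0.5 * (target - y)**2.
w, prev_delta = 0.5, 0.0
x, target = 1.0, 1.0
y = w * x
error = target - y          # e(t)
grad = -(target - y) * x    # dE/dw
delta = three_term_update(grad, prev_delta, error)
w += delta
print(f"updated weight: {w:.4f}")
```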




